-
Cosmological simulations of scale-dependent primordial non-Gaussianity
Authors:
Marco Baldi,
Emanuele Fondi,
Dionysios Karagiannis,
Lauro Moscardini,
Andrea Ravenni,
William R. Coulton,
Gabriel Jung,
Michele Liguori,
Marco Marinucci,
Licia Verde,
Francisco Villaescusa-Navarro,
Benjamin D. Wandelt
Abstract:
We present the results of a set of cosmological N-body simulations with standard $Λ$CDM cosmology but characterized by a scale-dependent primordial non-Gaussianity of the local type featuring a power-law dependence of the $f_{\rm NL}^{\rm loc}(k)$ at large scales followed by a saturation to a constant value at smaller scales where non-linear growth leads to the formation of collapsed cosmic structures. Such models are built to ensure consistency with current Cosmic Microwave Background bounds on primordial non-Gaussianity while still allowing for large effects of the non-Gaussian statistics on the properties of non-linear structure formation. We show the impact of such scale-dependent non-Gaussian scenarios on a wide range of properties of the resulting cosmic structures, such as the non-linear matter power spectrum, the halo and sub-halo mass functions, the concentration-mass relation, and the halo and void density profiles, and we highlight for the first time that some of these models might mimic the effects of Warm Dark Matter for several of these observables.
Submitted 11 July, 2024; v1 submitted 9 July, 2024;
originally announced July 2024.
-
Denoising Diffusion Delensing Delight: Reconstructing the Non-Gaussian CMB Lensing Potential with Diffusion Models
Authors:
Thomas Flöss,
William R. Coulton,
Adriaan J. Duivenvoorden,
Francisco Villaescusa-Navarro,
Benjamin D. Wandelt
Abstract:
Optimal extraction of cosmological information from observations of the Cosmic Microwave Background critically relies on our ability to accurately undo the distortions caused by weak gravitational lensing. In this work, we demonstrate the use of denoising diffusion models in performing Bayesian lensing reconstruction. We show that score-based generative models can produce accurate, uncorrelated samples from the CMB lensing convergence map posterior, given noisy CMB observations. To validate our approach, we compare the samples of our model to those obtained using established Hamiltonian Monte Carlo methods, which assume a Gaussian lensing potential. We then go beyond this assumption of Gaussianity, and train and validate our model on non-Gaussian lensing data, obtained by ray-tracing N-body simulations. We demonstrate that in this case, samples from our model have accurate non-Gaussian statistics beyond the power spectrum. The method provides an avenue towards more efficient and accurate lensing reconstruction, that does not rely on an approximate analytic description of the posterior probability. The reconstructed lensing maps can be used as an unbiased tracer of the matter distribution, and to improve delensing of the CMB, resulting in more precise cosmological parameter inference.
Submitted 6 June, 2024; v1 submitted 9 May, 2024;
originally announced May 2024.
-
Bye bye, local bias: the statistics of the halo field are poorly determined by the local mass density
Authors:
Deaglan J. Bartlett,
Matthew Ho,
Benjamin D. Wandelt
Abstract:
Bias models relating the dark matter field to the spatial distribution of halos are widely used in current cosmological analyses. Many models predict halos purely from the local Eulerian matter density, yet bias models in perturbation theory require the inclusion of other local properties. We assess the validity of assuming that only the local dark matter density can be used to predict the number density of halos in a model-independent way and in the non-perturbative regime. Utilising $N$-body simulations, we study the properties of the halo counts field after spatial voxels with near-equal dark matter density have been permuted. If local-in-matter-density biasing were valid, the statistical properties of the permuted and un-permuted fields would be indistinguishable since both represent equally fair draws of the stochastic biasing model. For voxels of side length $\sim4-30\,h^{-1}{\rm\,Mpc}$ and for halos less massive than $\sim10^{15}\,h^{-1}{\rm\,M_\odot}$, we find that the permuted halo field has a scale-dependent bias with greater than 25% more power on scales relevant for current surveys. These bias models remove small-scale power by not modelling correlations between neighbouring voxels, which substantially boosts large-scale power to conserve the field's total variance. This conclusion is robust to the choice of initial conditions and cosmology. Assuming local-in-matter-density halo biasing cannot, therefore, reproduce the distribution of halos across a large range of scales and halo masses, no matter how complex the model. One must either allow the biasing to be a function of other quantities and/or remove the assumption that neighbouring voxels are statistically independent.
Submitted 21 June, 2024; v1 submitted 1 May, 2024;
originally announced May 2024.
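The permutation test described in this abstract can be illustrated with a minimal sketch. It assumes a toy lognormal density field and Poisson halo counts, and it groups voxels into quantile bins of matter density; the function name `permute_within_density_bins`, the bin count, and the mock fields are all hypothetical choices made for illustration, not the authors' pipeline.

```python
import numpy as np

def permute_within_density_bins(delta, halo_counts, n_bins=20, seed=0):
    """Shuffle halo counts among voxels of near-equal matter density.

    Illustrative only: "near-equal" is approximated here by quantile bins.
    """
    rng = np.random.default_rng(seed)
    delta_flat = delta.ravel()
    counts_flat = halo_counts.ravel().copy()
    # Quantile edges so that every bin holds voxels of similar density.
    edges = np.quantile(delta_flat, np.linspace(0.0, 1.0, n_bins + 1))
    # Use interior edges only, giving labels in the range 0..n_bins-1.
    labels = np.digitize(delta_flat, edges[1:-1])
    for b in range(n_bins):
        idx = np.flatnonzero(labels == b)
        # Reassign the counts of this bin to a random ordering of its voxels.
        counts_flat[idx] = counts_flat[rng.permutation(idx)]
    return counts_flat.reshape(halo_counts.shape), labels.reshape(delta.shape)

# Toy fields (assumptions): lognormal overdensity and Poisson halo counts.
rng = np.random.default_rng(42)
delta = rng.lognormal(mean=0.0, sigma=0.5, size=(16, 16, 16)) - 1.0
halo_counts = rng.poisson(0.5 * (1.0 + delta))
permuted, labels = permute_within_density_bins(delta, halo_counts)
```

By construction the shuffle preserves the one-point relation between density and halo counts, so any difference in the clustering statistics of `permuted` and `halo_counts` would come from information beyond the local density.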
-
Bayesian Multi-line Intensity Mapping
Authors:
Yun-Ting Cheng,
Kailai Wang,
Benjamin D. Wandelt,
Tzu-Ching Chang,
Olivier Doré
Abstract:
Line intensity mapping (LIM) has emerged as a promising tool for probing the 3D large-scale structure (LSS) through the aggregate emission of spectral lines. The presence of interloper lines poses a crucial challenge in extracting the signal from the target line in LIM. In this work, we introduce a novel method for LIM analysis that simultaneously extracts line signals from multiple spectral lines, utilizing the covariance of native LIM data elements defined in the spectral-angular space. We leverage correlated information from different lines to perform joint inference on all lines simultaneously, employing a Bayesian analysis framework. We present the formalism, demonstrate our technique with a mock survey setup resembling the SPHEREx deep field observation, and consider four spectral lines within the SPHEREx spectral coverage in the near infrared: H$α$, $[$\ion{O}{3}$]$, H$β$, and $[$\ion{O}{2}$]$. We demonstrate that our method can extract the power spectrum of all four lines at $\gtrsim 10σ$ level at $z<2$. For the brightest line H$α$, the $10σ$ sensitivity can be achieved out to $z\sim3$. Our technique offers a flexible framework for LIM analysis, enabling simultaneous inference of signals from multiple line emissions while accommodating diverse modeling constraints and parametrizations.
Submitted 28 March, 2024;
originally announced March 2024.
-
Zooming by in the CARPoolGP lane: new CAMELS-TNG simulations of zoomed-in massive halos
Authors:
Max E. Lee,
Shy Genel,
Benjamin D. Wandelt,
Benjamin Zhang,
Ana Maria Delgado,
Shivam Pandey,
Erwin T. Lau,
Christopher Carr,
Harrison Cook,
Daisuke Nagai,
Daniel Angles-Alcazar,
Francisco Villaescusa-Navarro,
Greg L. Bryan
Abstract:
Galaxy formation models within cosmological hydrodynamical simulations contain numerous parameters with non-trivial influences over the resulting properties of simulated cosmic structures and galaxy populations. It is computationally challenging to sample these high dimensional parameter spaces with simulations, particularly for halos in the high-mass end of the mass function. In this work, we develop a novel sampling and reduced variance regression method, CARPoolGP, which leverages built-in correlations between samples in different locations of high dimensional parameter spaces to provide an efficient way to explore parameter space and generate low variance emulations of summary statistics. We use this method to extend the Cosmology and Astrophysics with MachinE Learning Simulations (CAMELS) to include a set of 768 zoom-in simulations of halos in the mass range of $10^{13} - 10^{14.5} M_\odot\,h^{-1}$ that span a 28-dimensional parameter space in the IllustrisTNG model. With these simulations and the CARPoolGP emulation method, we explore parameter trends in the Compton $Y-M$, black hole mass-halo mass, and metallicity-mass relations, as well as thermodynamic profiles and quenched fractions of satellite galaxies. We use these emulations to provide a physical picture of the complex interplay between supernova and active galactic nuclei feedback. We then use emulations of the $Y-M$ relation of massive halos to perform Fisher forecasts on astrophysical parameters for future Sunyaev-Zeldovich observations and find a significant improvement in forecasted constraints. We publicly release both the simulation suite and CARPoolGP software package.
Submitted 15 March, 2024;
originally announced March 2024.
-
Quijote-PNG: Optimizing the summary statistics to measure Primordial non-Gaussianity
Authors:
Gabriel Jung,
Andrea Ravenni,
Michele Liguori,
Marco Baldi,
William R. Coulton,
Francisco Villaescusa-Navarro,
Benjamin D. Wandelt
Abstract:
We apply a suite of different estimators to the Quijote-PNG halo catalogues to find the best approach to constrain Primordial non-Gaussianity (PNG) at non-linear cosmological scales, up to $k_{\rm max} = 0.5 \, h\,{\rm Mpc}^{-1}$. The set of summary statistics considered in our analysis includes the power spectrum, bispectrum, halo mass function, marked power spectrum, and marked modal bispectrum. Marked statistics are used here for the first time in the context of PNG study. We perform a Fisher analysis to estimate their cosmological information content, showing substantial improvements when marked observables are added to the analysis. Starting from these summaries, we train deep neural networks (NN) to perform likelihood-free inference of cosmological and PNG parameters. We assess the performance of different subsets of summary statistics; in the case of $f_\mathrm{NL}^\mathrm{equil}$, we find that a combination of the power spectrum and a suitable marked power spectrum outperforms the combination of power spectrum and bispectrum, the baseline statistics usually employed in PNG analysis. A minimal pipeline to analyse the statistics we identified can be implemented either with our ML algorithm or via more traditional estimators, if these are deemed more reliable.
Submitted 1 March, 2024;
originally announced March 2024.
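A Fisher analysis of the kind mentioned in this abstract can be sketched as follows, assuming a Gaussian likelihood with parameter-independent covariance, so that $F_{ij} = \partial_i μ^T C^{-1} \partial_j μ$. The derivative arrays, the covariances, and the assumption that the two statistics are uncorrelated are invented for illustration and do not come from the paper.

```python
import numpy as np

def fisher_matrix(derivs, cov):
    """F_ij = dmu/dtheta_i . C^{-1} . dmu/dtheta_j (Gaussian likelihood)."""
    derivs = np.atleast_2d(derivs)          # shape (n_params, n_data)
    return derivs @ np.linalg.solve(cov, derivs.T)

def marginalised_errors(fisher):
    """1-sigma errors after marginalising over the other parameters."""
    return np.sqrt(np.diag(np.linalg.inv(fisher)))

# Hypothetical derivatives of two summary statistics w.r.t. two parameters.
derivs_a = np.array([[1.0, 0.8, 0.5],
                     [0.9, 0.8, 0.6]])      # statistic A, 3 data bins
derivs_b = np.array([[1.0, 0.2],
                     [-0.5, 1.0]])          # statistic B, 2 data bins
cov_a = 0.01 * np.eye(3)
cov_b = 0.01 * np.eye(2)

# Joint data vector; the block-diagonal covariance assumes A and B are
# uncorrelated, which is a simplification made only for this example.
derivs_joint = np.hstack([derivs_a, derivs_b])
cov_joint = np.block([[cov_a, np.zeros((3, 2))],
                      [np.zeros((2, 3)), cov_b]])

fisher_a = fisher_matrix(derivs_a, cov_a)
fisher_b = fisher_matrix(derivs_b, cov_b)
fisher_joint = fisher_matrix(derivs_joint, cov_joint)
err_a = marginalised_errors(fisher_a)
err_joint = marginalised_errors(fisher_joint)
```

In this toy setup the two parameters are nearly degenerate in statistic A alone, and adding statistic B breaks the degeneracy, so `err_joint` is smaller than `err_a` for both parameters.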
-
syren-halofit: A fast, interpretable, high-precision formula for the $Λ$CDM nonlinear matter power spectrum
Authors:
Deaglan J. Bartlett,
Benjamin D. Wandelt,
Matteo Zennaro,
Pedro G. Ferreira,
Harry Desmond
Abstract:
Rapid and accurate evaluation of the nonlinear matter power spectrum, $P(k)$, as a function of cosmological parameters and redshift is of fundamental importance in cosmology. Analytic approximations provide an interpretable solution, yet current approximations are neither fast nor accurate relative to numerical emulators. We use symbolic regression to obtain simple analytic approximations to the nonlinear scale, $k_σ$, the effective spectral index, $n_{\rm eff}$, and the curvature, $C$, which are required for the halofit model. We then re-optimise the coefficients of halofit to fit a wide range of cosmologies and redshifts. We explore the space of analytic expressions to fit the residuals between $P(k)$ and the optimised predictions of halofit. Our results are designed to match the predictions of EuclidEmulator2, but are validated against $N$-body simulations. Our symbolic expressions for $k_σ$, $n_{\rm eff}$ and $C$ have root mean squared fractional errors of 0.8%, 0.2% and 0.3%, respectively, for redshifts below 3 and a wide range of cosmologies. The re-optimised halofit parameters reduce the root mean squared fractional error (compared to EuclidEmulator2) from 3% to below 2% for wavenumbers $k=9\times10^{-3}-9 \, h{\rm Mpc^{-1}}$. We introduce syren-halofit (symbolic-regression-enhanced halofit), an extension to halofit containing a short symbolic correction which improves this error to 1%. Our method is 2350 and 3170 times faster than current halofit and hmcode implementations, respectively, and 2680 and 64 times faster than EuclidEmulator2 (which requires running class) and the BACCO emulator. We obtain comparable accuracy to EuclidEmulator2 and BACCO when tested on $N$-body simulations. Our work greatly increases the speed and accuracy of symbolic approximations to $P(k)$, making them significantly faster than their numerical counterparts without loss of accuracy.
Submitted 15 April, 2024; v1 submitted 27 February, 2024;
originally announced February 2024.
-
LtU-ILI: An All-in-One Framework for Implicit Inference in Astrophysics and Cosmology
Authors:
Matthew Ho,
Deaglan J. Bartlett,
Nicolas Chartier,
Carolina Cuesta-Lazaro,
Simon Ding,
Axel Lapel,
Pablo Lemos,
Christopher C. Lovell,
T. Lucas Makinen,
Chirag Modi,
Viraj Pandya,
Shivam Pandey,
Lucia A. Perez,
Benjamin Wandelt,
Greg L. Bryan
Abstract:
This paper presents the Learning the Universe Implicit Likelihood Inference (LtU-ILI) pipeline, a codebase for rapid, user-friendly, and cutting-edge machine learning (ML) inference in astrophysics and cosmology. The pipeline includes software for implementing various neural architectures, training schemata, priors, and density estimators in a manner easily adaptable to any research workflow. It includes comprehensive validation metrics to assess posterior estimate coverage, enhancing the reliability of inferred results. Additionally, the pipeline is easily parallelizable and is designed for efficient exploration of modeling hyperparameters. To demonstrate its capabilities, we present real applications across a range of astrophysics and cosmology problems, such as: estimating galaxy cluster masses from X-ray photometry; inferring cosmology from matter power spectra and halo point clouds; characterizing progenitors in gravitational wave signals; capturing physical dust parameters from galaxy colors and luminosities; and establishing properties of semi-analytic models of galaxy formation. We also include exhaustive benchmarking and comparisons of all implemented methods as well as discussions about the challenges and pitfalls of ML inference in astronomical sciences. All code and examples are made publicly available at https://github.com/maho3/ltu-ili.
Submitted 2 July, 2024; v1 submitted 6 February, 2024;
originally announced February 2024.
-
A precise symbolic emulator of the linear matter power spectrum
Authors:
Deaglan J. Bartlett,
Lukas Kammerer,
Gabriel Kronberger,
Harry Desmond,
Pedro G. Ferreira,
Benjamin D. Wandelt,
Bogdan Burlacu,
David Alonso,
Matteo Zennaro
Abstract:
Computing the matter power spectrum, $P(k)$, as a function of cosmological parameters can be prohibitively slow in cosmological analyses, hence emulating this calculation is desirable. Previous analytic approximations are insufficiently accurate for modern applications, so black-box, uninterpretable emulators are often used. We utilise an efficient genetic programming based symbolic regression framework to explore the space of potential mathematical expressions which can approximate the power spectrum and $σ_8$. We learn the ratio between an existing low-accuracy fitting function for $P(k)$ and that obtained by solving the Boltzmann equations and thus still incorporate the physics which motivated this earlier approximation. We obtain an analytic approximation to the linear power spectrum with a root mean squared fractional error of 0.2% between $k = 9\times10^{-3} - 9 \, h{\rm \, Mpc^{-1}}$ and across a wide range of cosmological parameters, and we provide physical interpretations for various terms in the expression. Our analytic approximation is 950 times faster to evaluate than camb and 36 times faster than the neural network based matter power spectrum emulator BACCO. We also provide a simple analytic approximation for $σ_8$ with a similar accuracy, with a root mean squared fractional error of just 0.1% when evaluated across the same range of cosmologies. This function is easily invertible to obtain $A_{\rm s}$ as a function of $σ_8$ and the other cosmological parameters, if preferred. It is possible to obtain symbolic approximations to a seemingly complex function at a precision required for current and future cosmological analyses without resorting to deep-learning techniques, thus avoiding their black-box nature and large number of parameters. Our emulator will be usable long after the codes on which numerical approximations are built become outdated.
Submitted 15 April, 2024; v1 submitted 27 November, 2023;
originally announced November 2023.
-
Taming assembly bias for primordial non-Gaussianity
Authors:
Emanuele Fondi,
Licia Verde,
Francisco Villaescusa-Navarro,
Marco Baldi,
William R. Coulton,
Gabriel Jung,
Dionysios Karagiannis,
Michele Liguori,
Andrea Ravenni,
Benjamin D. Wandelt
Abstract:
Primordial non-Gaussianity of the local type induces a strong scale-dependent bias on the clustering of halos in the late-time Universe. This signature is particularly promising to provide constraints on the non-Gaussianity parameter $f_{\rm NL}$ from galaxy surveys, as the bias amplitude grows with scale and becomes important on large, linear scales. However, there is a well-known degeneracy between the real prize, the $f_{\rm NL}$ parameter, and the (non-Gaussian) assembly bias i.e., the halo formation history-dependent contribution to the amplitude of the signal, which could seriously compromise the ability of large-scale structure surveys to constrain $f_{\rm NL}$. We show how the assembly bias can be modeled and constrained, thus almost completely recovering the power of galaxy surveys to competitively constrain primordial non-Gaussianity. In particular, studying hydrodynamical simulations, we find that a proxy for the halo properties that determine assembly bias can be constructed from photometric properties of galaxies. Using a prior on the assembly bias guided by this proxy degrades the statistical errors on $f_{\rm NL}$ only mildly compared to an ideal case where the assembly bias is perfectly known. The systematic error on $f_{\rm NL}$ that the proxy induces can be safely kept under control.
Submitted 2 February, 2024; v1 submitted 16 November, 2023;
originally announced November 2023.
-
Optimal simulation-based Bayesian decisions
Authors:
Justin Alsing,
Thomas D. P. Edwards,
Benjamin Wandelt
Abstract:
We present a framework for the efficient computation of optimal Bayesian decisions under intractable likelihoods, by learning a surrogate model for the expected utility (or its distribution) as a function of the action and data spaces. We leverage recent advances in simulation-based inference and Bayesian optimization to develop active learning schemes to choose where in parameter and action spaces to simulate. This allows us to learn the optimal action in as few simulations as possible. The resulting framework is extremely simulation efficient, typically requiring fewer model calls than the associated posterior inference task alone, and a factor of $100-1000$ more efficient than Monte-Carlo based methods. Our framework opens up new capabilities for performing Bayesian decision making, particularly in the previously challenging regime where likelihoods are intractable, and simulations expensive.
Submitted 9 November, 2023;
originally announced November 2023.
-
Simulation-based Inference of Reionization Parameters from 3D Tomographic 21 cm Light-cone Images -- II: Application of Solid Harmonic Wavelet Scattering Transform
Authors:
Xiaosheng Zhao,
Yi Mao,
Shifan Zuo,
Benjamin D. Wandelt
Abstract:
The information regarding how the intergalactic medium is reionized by astrophysical sources is contained in the tomographic three-dimensional 21 cm images from the epoch of reionization. In Zhao et al. (2022a) ("Paper I"), we demonstrated for the first time that density estimation likelihood-free inference (DELFI) can be applied efficiently to perform a Bayesian inference of the reionization parameters from the 21 cm images. Nevertheless, the 3D image data needs to be compressed into informative summaries as the input of DELFI by, e.g., a trained 3D convolutional neural network (CNN) as in Paper I (DELFI-3D CNN). Here in this paper, we introduce an alternative data compressor, the solid harmonic wavelet scattering transform (WST), which has a similar, yet fixed (i.e. no training), architecture to CNN, but we show that this approach (i.e. solid harmonic WST with DELFI) outperforms earlier analyses based on 3D 21 cm images using DELFI-3D CNN in terms of credible regions of parameters. Realistic effects, including thermal noise and residual foreground after removal, are also applied to the mock observations from the Square Kilometre Array (SKA). We show that under the same inference strategy using DELFI, the 21 cm image analysis with solid harmonic WST outperforms the 21 cm power spectrum analysis. This research serves as a proof of concept, demonstrating the potential to harness the strengths of WST and simulation-based inference to derive insights from future 21 cm light-cone image data.
Submitted 26 October, 2023;
originally announced October 2023.
-
Sensitivity Analysis of Simulation-Based Inference for Galaxy Clustering
Authors:
Chirag Modi,
Shivam Pandey,
Matthew Ho,
ChangHoon Hahn,
Bruno Régaldo-Saint Blancard,
Benjamin Wandelt
Abstract:
Simulation-based inference (SBI) is a promising approach to leverage high fidelity cosmological simulations and extract information from the non-Gaussian, non-linear scales that cannot be modeled analytically. However, scaling SBI to the next generation of cosmological surveys faces the computational challenge of requiring a large number of accurate simulations over a wide range of cosmologies, while simultaneously encompassing large cosmological volumes at high resolution. This challenge can potentially be mitigated by balancing the accuracy and computational cost for different components of the forward model while ensuring robust inference. To guide our steps in this, we perform a sensitivity analysis of SBI for galaxy clustering on various components of the cosmological simulations: gravity model, halo-finder and the galaxy-halo distribution models (halo-occupation distribution, HOD). We infer $σ_8$ and $Ω_m$ using galaxy power spectrum multipoles and the bispectrum monopole assuming a galaxy number density expected from the luminous red galaxies observed using the Dark Energy Spectroscopic Instrument (DESI). We find that SBI is insensitive to changing the gravity model between $N$-body simulations and particle mesh (PM) simulations. However, changing the halo-finder from friends-of-friends (FoF) to Rockstar can lead to a biased estimate of $σ_8$ based on the bispectrum. For galaxy models, training SBI on a more complex HOD leads to consistent inference for less complex HOD models, but SBI trained on simpler HOD models fails when applied to analyze data from a more complex HOD model. Based on our results, we discuss the outlook on cosmological simulations with a focus on applying SBI approaches to future galaxy surveys.
Submitted 26 September, 2023;
originally announced September 2023.
-
Neutrino mass constraint from an Implicit Likelihood Analysis of BOSS voids
Authors:
Leander Thiele,
Elena Massara,
Alice Pisani,
ChangHoon Hahn,
David N. Spergel,
Shirley Ho,
Benjamin Wandelt
Abstract:
Cosmic voids identified in the spatial distribution of galaxies provide complementary information to two-point statistics. In particular, constraints on the neutrino mass sum, $\sum m_ν$, promise to benefit from the inclusion of void statistics. We perform inference on the CMASS NGC sample of SDSS-III/BOSS with the aim of constraining $\sum m_ν$. We utilize the void size function, the void galaxy cross power spectrum, and the galaxy auto power spectrum. To extract constraints from these summary statistics we use a simulation-based approach, specifically implicit likelihood inference. We populate approximate gravity-only, particle neutrino cosmological simulations with an expressive halo occupation distribution model. With a conservative scale cut of $k_{\rm max} = 0.15 \, h\,{\rm Mpc}^{-1}$ and a Planck-inspired $Λ$CDM prior, we find upper bounds on $\sum m_ν$ of 0.43 and 0.35 eV from the galaxy auto power spectrum and the full data vector, respectively (95% credible interval). We observe hints that the void statistics may be most effective at constraining $\sum m_ν$ from below. We also substantiate the usual assumption that the void size function is Poisson distributed.
Submitted 14 July, 2023;
originally announced July 2023.
-
Cosmic Chronometers with Photometry: a new path to $H(z)$
Authors:
Raul Jimenez,
Michele Moresco,
Licia Verde,
Benjamin D. Wandelt
Abstract:
We present a proof-of-principle determination of the Hubble parameter $H(z)$ from photometric data, obtaining a determination at an effective redshift of $z=0.75$ ($0.65<z<0.85$) of $H(0.75) =105.0\pm 7.9\,{\rm (stat)}\pm 7.3\,{\rm (sys)}$ km s$^{-1}$ Mpc$^{-1}$, with 7.5% statistical and 7% systematic (10% with statistical and systematics combined in quadrature) accuracy. This is obtained in a cosmology model-independent fashion, but assuming a linear age-redshift relation in the relevant redshift range; as such, it can be used to constrain arbitrary cosmologies as long as $H(z)$ can be considered slowly varying over redshift. In particular, we have applied a neural network, trained on a well-studied spectroscopic sample of 140 objects, to the COSMOS2015 survey to construct a set of 19 thousand near-passively evolving galaxies and build an age-redshift relation. The Hubble parameter is given by the derivative of the red envelope of the age-redshift relation. This is the first time the Hubble parameter is determined from photometry at $\lesssim 10$% accuracy. Accurate $H(z)$ determinations could help shed light on the Hubble tension; this study shows that photometry, with a reduction of only a factor of two in the uncertainty, could provide a new perspective on the tension.
Submitted 20 June, 2023;
originally announced June 2023.
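The arithmetic behind the cosmic-chronometer entry above can be sketched as follows. The differential-age relation $H(z) = -\frac{1}{1+z}\,\frac{dz}{dt}$ is the standard one for this method; the function names and the unit-conversion constant are illustrative choices made here and are not taken from the paper. The last lines add the quoted statistical and systematic errors in quadrature.

```python
import math

# 1 Gyr^-1 expressed in km s^-1 Mpc^-1 (standard unit conversion).
KM_S_MPC_PER_INV_GYR = 977.8


def hubble_from_age_slope(z, dt_dz_gyr):
    """H(z) in km/s/Mpc from the slope dt/dz (Gyr per unit redshift).

    Uses H(z) = -1 / ((1 + z) * dt/dz); dt/dz is negative because
    galaxies at higher redshift are observed at a younger cosmic age.
    """
    return -KM_S_MPC_PER_INV_GYR / ((1.0 + z) * dt_dz_gyr)


def combine_in_quadrature(stat, syst):
    """Total uncertainty from independent statistical and systematic parts."""
    return math.hypot(stat, syst)


# Values quoted in the abstract (km/s/Mpc).
h_value, stat, syst = 105.0, 7.9, 7.3
total = combine_in_quadrature(stat, syst)
print(f"stat: {100 * stat / h_value:.1f}%  syst: {100 * syst / h_value:.1f}%  "
      f"total: {100 * total / h_value:.1f}%")
```

The fractional errors come out at roughly 7.5%, 7.0% and 10.2%, consistent with the rounded figures quoted in the abstract.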
-
Evidence Networks: simple losses for fast, amortized, neural Bayesian model comparison
Authors:
Niall Jeffrey,
Benjamin D. Wandelt
Abstract:
Evidence Networks can enable Bayesian model comparison when state-of-the-art methods (e.g. nested sampling) fail and even when likelihoods or priors are intractable or unknown. Bayesian model comparison, i.e. the computation of Bayes factors or evidence ratios, can be cast as an optimization problem. Though the Bayesian interpretation of optimal classification is well-known, here we change perspective and present classes of loss functions that result in fast, amortized neural estimators that directly estimate convenient functions of the Bayes factor. This mitigates numerical inaccuracies associated with estimating individual model probabilities. We introduce the leaky parity-odd power (l-POP) transform, leading to the novel ``l-POP-Exponential'' loss function. We explore neural density estimation for data probability in different models, showing it to be less accurate and scalable than Evidence Networks. Multiple real-world and synthetic examples illustrate that Evidence Networks are explicitly independent of dimensionality of the parameter space and scale mildly with the complexity of the posterior probability density function. This simple yet powerful approach has broad implications for model inference tasks. As an application of Evidence Networks to real-world data we compute the Bayes factor for two models with gravitational lensing data of the Dark Energy Survey. We briefly discuss applications of our methods to other, related problems of model comparison and evaluation in implicit inference settings.
Submitted 10 January, 2024; v1 submitted 18 May, 2023;
originally announced May 2023.
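The well-known link between optimal classification and the Bayes factor mentioned in the Evidence Networks abstract can be illustrated with a toy problem. This is a minimal sketch under stated assumptions: the two Gaussian models, the logistic-regression classifier and all names are hypothetical choices made here, and the loss is plain cross-entropy, not the l-POP-Exponential loss introduced in the paper. With equal numbers of training samples per model, the logit of a cross-entropy-optimal classifier approximates $\ln K$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Toy models (assumed for illustration): M0: x ~ N(0, 1), M1: x ~ N(1, 1).
x = np.concatenate([rng.normal(0.0, 1.0, n), rng.normal(1.0, 1.0, n)])
y = np.concatenate([np.zeros(n), np.ones(n)])  # model labels

# Logistic regression fitted by Newton's method (minimises cross-entropy).
X = np.column_stack([np.ones_like(x), x])
w = np.zeros(2)
for _ in range(25):
    p = 1.0 / (1.0 + np.exp(-X @ w))
    grad = X.T @ (y - p)
    hess = X.T @ (X * (p * (1.0 - p))[:, None])
    w += np.linalg.solve(hess, grad)


def log_bayes_factor(x_obs):
    """Estimated ln K = ln[p(x|M1) / p(x|M0)] from the classifier logit."""
    return w[0] + w[1] * x_obs


def exact_log_bayes_factor(x_obs):
    """Analytic ln K for the two unit-variance Gaussians: x - 1/2."""
    return x_obs - 0.5


for x_obs in (-1.0, 0.5, 2.0):
    print(x_obs, log_bayes_factor(x_obs), exact_log_bayes_factor(x_obs))
```

The classifier never evaluates either likelihood, only labelled samples, which is why this style of estimate remains usable when likelihoods or priors are intractable.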
-
Quijote-PNG: The Information Content of the Halo Mass Function
Authors:
Gabriel Jung,
Andrea Ravenni,
Marco Baldi,
William R. Coulton,
Drew Jamieson,
Dionysios Karagiannis,
Michele Liguori,
Helen Shao,
Licia Verde,
Francisco Villaescusa-Navarro,
Benjamin D. Wandelt
Abstract:
We study signatures of primordial non-Gaussianity (PNG) in the redshift-space halo field on non-linear scales, using a combination of three summary statistics, namely the halo mass function (HMF), power spectrum, and bispectrum. The choice of adding the HMF to our previous joint analysis of power spectrum and bispectrum is driven by a preliminary field-level analysis, in which we train graph neural networks on halo catalogues to infer the PNG $f_\mathrm{NL}$ parameter. The covariance matrix and the responses of our summaries to changes in model parameters are extracted from a suite of halo catalogues constructed from the Quijote-PNG N-body simulations. We consider the three main types of PNG: local, equilateral and orthogonal. Adding the HMF to our previous joint analysis of power spectrum and bispectrum produces two main effects. First, it reduces the equilateral $f_\mathrm{NL}$ predicted errors by roughly a factor $2$, while also producing notable, although smaller, improvements for orthogonal PNG. Second, it helps break the degeneracy between the local PNG amplitude, $f_\mathrm{NL}^\mathrm{local}$, and assembly bias, $b_φ$, without relying on any external prior assumption. Our final forecasts for PNG parameters are $Δf_\mathrm{NL}^\mathrm{local} = 40$, $Δf_\mathrm{NL}^\mathrm{equil} = 210$, $Δf_\mathrm{NL}^\mathrm{ortho} = 91$, on a cubic volume of $1 \left(h^{-1}{\rm Gpc}\right)^3$, with a halo number density of $\bar{n}\sim 5.1 \times 10^{-5}~h^3\mathrm{Mpc}^{-3}$, at $z = 1$, and considering scales up to $k_\mathrm{max} = 0.5~h\,\mathrm{Mpc}^{-1}$.
Submitted 4 February, 2024; v1 submitted 17 May, 2023;
originally announced May 2023.
-
How to estimate Fisher information matrices from simulations
Authors:
William R. Coulton,
Benjamin D. Wandelt
Abstract:
The Fisher information matrix is a quantity of fundamental importance for information geometry and asymptotic statistics. In practice, it is widely used to quickly estimate the expected information available in a data set and guide experimental design choices. In many modern applications, it is intractable to analytically compute the Fisher information and Monte Carlo methods are used instead. The standard Monte Carlo method produces estimates of the Fisher information that can be biased when the Monte-Carlo noise is non-negligible. Most problematic is noise in the derivatives as this leads to an overestimation of the available constraining power, given by the inverse Fisher information. In this work we find another simple estimate that is oppositely biased and produces an underestimate of the constraining power. This estimator can either be used to give approximate bounds on the parameter constraints or can be combined with the standard estimator to give improved, approximately unbiased estimates. Both the alternative and the combined estimators are asymptotically unbiased so can be also used as a convergence check of the standard approach. We discuss potential limitations of these estimators and provide methods to assess their reliability. These methods accelerate the convergence of Fisher forecasts, as unbiased estimates can be achieved with fewer Monte Carlo samples, and so can be used to reduce the simulated data set size by several orders of magnitude.
Submitted 3 June, 2023; v1 submitted 15 May, 2023;
originally announced May 2023.
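The bias of the standard Monte Carlo Fisher estimate described in the entry above can be reproduced with a toy simulator. This is a minimal sketch, assuming a Gaussian data vector with known identity covariance and a single parameter; the simulator, the settings and all names are hypothetical. It illustrates only why noisy derivatives inflate the standard estimate, and does not implement the alternative or combined estimators proposed in the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

n_data = 50      # dimension of the data vector
n_sims = 100     # simulations per finite-difference step
delta = 0.1      # step size in the parameter
theta0 = 1.0     # fiducial parameter value


def simulate(theta, size):
    """Toy simulator (assumed): d ~ N(theta * 1, I), unit covariance."""
    return theta + rng.normal(size=(size, n_data))


# Analytic Fisher information: dmu/dtheta = (1, ..., 1), C = I.
fisher_true = float(n_data)


def standard_fisher_estimate():
    """Standard Monte Carlo estimate with a finite-difference derivative."""
    mean_plus = simulate(theta0 + delta, n_sims).mean(axis=0)
    mean_minus = simulate(theta0 - delta, n_sims).mean(axis=0)
    dmu = (mean_plus - mean_minus) / (2.0 * delta)
    return float(dmu @ dmu)  # covariance is the identity here


estimates = np.array([standard_fisher_estimate() for _ in range(200)])
# Noise in dmu adds n_data / (2 * n_sims * delta**2) to the expectation.
expected_bias = n_data / (2.0 * n_sims * delta**2)
print(fisher_true, estimates.mean(), fisher_true + expected_bias)
```

Because the estimate is quadratic in the noisy derivative, the noise contributes a strictly positive term, so the inferred Fisher information is too large and the forecast errors too small.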
-
Posterior Sampling of the Initial Conditions of the Universe from Non-linear Large Scale Structures using Score-Based Generative Models
Authors:
Ronan Legin,
Matthew Ho,
Pablo Lemos,
Laurence Perreault-Levasseur,
Shirley Ho,
Yashar Hezaveh,
Benjamin Wandelt
Abstract:
Reconstructing the initial conditions of the universe is a key problem in cosmology. Methods based on simulating the forward evolution of the universe have provided a way to infer initial conditions consistent with present-day observations. However, due to the high complexity of the inference problem, these methods either fail to sample a distribution of possible initial density fields or require significant approximations in the simulation model to be tractable, potentially leading to biased results. In this work, we propose the use of score-based generative models to sample realizations of the early universe given present-day observations. We infer the initial density field of full high-resolution dark matter N-body simulations from the present-day density field and verify the quality of produced samples compared to the ground truth based on summary statistics. The proposed method is capable of providing plausible realizations of the early universe density field from the initial conditions posterior distribution marginalized over cosmological parameters and can sample orders of magnitude faster than current state-of-the-art methods.
Submitted 7 April, 2023;
originally announced April 2023.
-
Machine-learning cosmology from void properties
Authors:
Bonny Y. Wang,
Alice Pisani,
Francisco Villaescusa-Navarro,
Benjamin D. Wandelt
Abstract:
Cosmic voids are the largest and most underdense structures in the Universe. Their properties have been shown to encode precious information about the laws and constituents of the Universe. We show that machine learning techniques can unlock the information in void features for cosmological parameter inference. We rely on thousands of void catalogs from the GIGANTES dataset, where every catalog contains an average of 11,000 voids from a volume of $1~(h^{-1}{\rm Gpc})^3$. We focus on three properties of cosmic voids: ellipticity, density contrast, and radius. We train 1) fully connected neural networks on histograms from individual void properties and 2) deep sets from void catalogs, to perform likelihood-free inference on the value of cosmological parameters. We find that our best models are able to constrain the value of $Ω_{\rm m}$, $σ_8$, and $n_s$ with mean relative errors of $10\%$, $4\%$, and $3\%$, respectively, without using any spatial information from the void catalogs. Our results provide an illustration for the use of machine learning to constrain cosmology with voids.
Submitted 6 October, 2023; v1 submitted 13 December, 2022;
originally announced December 2022.
-
A Framework for Obtaining Accurate Posteriors of Strong Gravitational Lensing Parameters with Flexible Priors and Implicit Likelihoods using Density Estimation
Authors:
Ronan Legin,
Yashar Hezaveh,
Laurence Perreault-Levasseur,
Benjamin Wandelt
Abstract:
We report the application of implicit likelihood inference to the prediction of the macro-parameters of strong lensing systems with neural networks. This allows us to perform deep learning analysis of lensing systems within a well-defined Bayesian statistical framework to explicitly impose desired priors on lensing variables, to obtain accurate posteriors, and to guarantee convergence to the optimal posterior in the limit of perfect performance. We train neural networks to perform a regression task to produce point estimates of lensing parameters. We then interpret these estimates as compressed statistics in our inference setup and model their likelihood function using mixture density networks. We compare our results with those of approximate Bayesian neural networks, discuss their significance, and point to future directions. Based on a test set of 100,000 strong lensing simulations, our amortized model produces accurate posteriors for any arbitrary confidence interval, with a maximum percentage deviation of $1.4\%$ at $21.8\%$ confidence level, without the need for any added calibration procedure. In total, inferring 100,000 different posteriors takes a day on a single GPU, showing that the method scales well to the thousands of lenses expected to be discovered by upcoming sky surveys.
Submitted 30 November, 2022;
originally announced December 2022.
-
Calibrating cosmological simulations with implicit likelihood inference using galaxy growth observables
Authors:
Yongseok Jo,
Shy Genel,
Benjamin Wandelt,
Rachel Somerville,
Francisco Villaescusa-Navarro,
Greg L. Bryan,
Daniel Angles-Alcazar,
Daniel Foreman-Mackey,
Dylan Nelson,
Ji-hoon Kim
Abstract:
In a novel approach employing implicit likelihood inference (ILI), also known as likelihood-free inference, we calibrate the parameters of cosmological hydrodynamic simulations against observations, which has previously been unfeasible due to the high computational cost of these simulations. For computational efficiency, we train neural networks as emulators on ~1000 cosmological simulations from the CAMELS project to estimate simulated observables, taking as input the cosmological and astrophysical parameters, and use these emulators as surrogates to the cosmological simulations. Using the cosmic star formation rate density (SFRD) and, separately, stellar mass functions (SMFs) at different redshifts, we perform ILI on selected cosmological and astrophysical parameters (Omega_m, sigma_8, stellar wind feedback, and kinetic black hole feedback) and obtain full 6-dimensional posterior distributions. In the performance test, the ILI from the emulated SFRD (SMFs) can recover the target observables with a relative error of 0.17% (0.4%). We find that degeneracies exist between the parameters inferred from the emulated SFRD, confirmed with new full cosmological simulations. We also find that the SMFs can break the degeneracy in the SFRD, which indicates that the SMFs provide complementary constraints for the parameters. Further, we find that the parameter combination inferred from an observationally-inferred SFRD reproduces the target observed SFRD very well, whereas, in the case of the SMFs, the inferred and observed SMFs show significant discrepancies that indicate potential limitations of the current galaxy formation modeling and calibration framework, and/or systematic differences and inconsistencies between observations of the stellar mass function.
Submitted 29 November, 2022;
originally announced November 2022.
-
Quijote-PNG: Quasi-maximum likelihood estimation of Primordial Non-Gaussianity in the non-linear halo density field
Authors:
Gabriel Jung,
Dionysios Karagiannis,
Michele Liguori,
Marco Baldi,
William R Coulton,
Drew Jamieson,
Licia Verde,
Francisco Villaescusa-Navarro,
Benjamin D. Wandelt
Abstract:
We study primordial non-Gaussian signatures in the redshift-space halo field on non-linear scales, using a quasi-maximum likelihood estimator based on optimally compressed power spectrum and modal bispectrum statistics. We train and validate the estimator on a suite of halo catalogues constructed from the Quijote-PNG N-body simulations, which we release to accompany this paper. We verify its unbiasedness and near optimality, for the three main types of primordial non-Gaussianity (PNG): local, equilateral, and orthogonal. We compare the modal bispectrum expansion with a $k$-binning approach, showing that the former allows for faster convergence of numerical derivatives in the computation of the score-function, thus leading to better final constraints. We find, in agreement with previous studies, that the local PNG signal in the halo-field is dominated by the scale-dependent bias signature on large scales and saturates at $k \sim 0.2~h\,\mathrm{Mpc}^{-1}$, whereas the small-scale bispectrum is the main source of information for equilateral and orthogonal PNG. Combining power spectrum and bispectrum on non-linear scales plays an important role in breaking degeneracies between cosmological and PNG parameters; such degeneracies remain however strong for equilateral PNG. We forecast that PNG parameters can be constrained with $Δf_\mathrm{NL}^\mathrm{local} = 45$, $Δf_\mathrm{NL}^\mathrm{equil} = 570$, $Δf_\mathrm{NL}^\mathrm{ortho} = 110$, on a cubic volume of $1 \left({ {\rm Gpc}/{ {\rm h}}} \right)^3$, at $z = 1$, considering scales up to $k_\mathrm{max} = 0.5~h\,\mathrm{Mpc}^{-1}$.
Submitted 18 May, 2023; v1 submitted 14 November, 2022;
originally announced November 2022.
-
Why is zero spatial curvature special?
Authors:
Raul Jimenez,
Ali Rida Khalife,
Daniel F. Litim,
Sabino Matarrese,
Benjamin D. Wandelt
Abstract:
Evidence for almost spatial flatness of the Universe has been provided from several observational probes, including the Cosmic Microwave Background (CMB) and Baryon Acoustic Oscillations (BAO) from galaxy clustering data. However, other than inflation, and in this case only in the limit of infinite time, there is no strong a priori motivation for a spatially flat Universe. Using the renormalization group (RG) technique in curved spacetime, we present in this work a theoretical motivation for spatial flatness. Starting from a general spacetime, the first step of the RG, coarse-graining, gives a Friedmann-Lemaître-Robertson-Walker (FLRW) metric with a set of parameters. Then, we study the rescaling properties of the curvature parameter, and find that zero spatial curvature of the FLRW metric is singled out as the unique scale-free, non-singular background for cosmological perturbations.
Submitted 13 September, 2023; v1 submitted 18 October, 2022;
originally announced October 2022.
-
Data-driven Cosmology from Three-dimensional Light Cones
Authors:
Yun-Ting Cheng,
Benjamin D. Wandelt,
Tzu-Ching Chang,
Olivier Dore
Abstract:
We present a data-driven technique to analyze multifrequency images from upcoming cosmological surveys mapping large sky areas. Using full information from the data at the two-point level, our method can simultaneously constrain the large-scale structure (LSS), the spectra and redshift distribution of emitting sources, and the noise in the observed data without any prior assumptions beyond the homogeneity and isotropy of cosmological perturbations. In particular, the method does not rely on source detection or photometric or spectroscopic redshift estimates. Here, we present the formalism and demonstrate our technique with a mock observation from nine optical and near-infrared photometric bands. Our method can recover the input signal and noise without bias, and quantify the uncertainty on the constraints. Our technique provides a flexible framework to analyze the LSS observation traced by different types of sources, which has potential for wide application to current or future cosmological datasets such as SPHEREx, Rubin Observatory, Euclid, or the Nancy Grace Roman Space Telescope.
Submitted 28 January, 2023; v1 submitted 18 October, 2022;
originally announced October 2022.
-
Snowmass Theory Frontier: Astrophysics and Cosmology
Authors:
Daniel Green,
Joshua T. Ruderman,
Benjamin R. Safdi,
Jessie Shelton,
Ana Achúcarro,
Peter Adshead,
Yashar Akrami,
Masha Baryakhtar,
Daniel Baumann,
Asher Berlin,
Nikita Blinov,
Kimberly K. Boddy,
Malte Buschmann,
Giovanni Cabass,
Robert Caldwell,
Emanuele Castorina,
Thomas Y. Chen,
Xingang Chen,
William Coulton,
Djuna Croon,
Yanou Cui,
David Curtin,
Francis-Yan Cyr-Racine,
Christopher Dessert,
Keith R. Dienes
, et al. (62 additional authors not shown)
Abstract:
We summarize progress made in theoretical astrophysics and cosmology over the past decade and areas of interest for the coming decade. This Report is prepared as the TF09 "Astrophysics and Cosmology" topical group summary for the Theory Frontier as part of the Snowmass 2021 process.
Submitted 14 September, 2022;
originally announced September 2022.
-
The Cosmic Graph: Optimal Information Extraction from Large-Scale Structure using Catalogues
Authors:
T. Lucas Makinen,
Tom Charnock,
Pablo Lemos,
Natalia Porqueres,
Alan Heavens,
Benjamin D. Wandelt
Abstract:
We present an implicit likelihood approach to quantifying cosmological information over discrete catalogue data, assembled as graphs. To do so, we explore cosmological parameter constraints using mock dark matter halo catalogues. We employ Information Maximising Neural Networks (IMNNs) to quantify Fisher information extraction as a function of graph representation. We a) demonstrate the high sensitivity of modular graph structure to the underlying cosmology in the noise-free limit, b) show that graph neural network summaries automatically combine mass and clustering information through comparisons to traditional statistics, c) demonstrate that networks can still extract information when catalogues are subject to noisy survey cuts, and d) illustrate how nonlinear IMNN summaries can be used as asymptotically optimal compressed statistics for Bayesian simulation-based inference. We reduce the area of joint $Ω_m, σ_8$ parameter constraints with small ($\sim$100 object) halo catalogues by a factor of 42 over the two-point correlation function, and demonstrate that the networks automatically combine mass and clustering information. This work utilises a new IMNN implementation over graph data in Jax, which can take advantage of either numerical or auto-differentiability. We also show that graph IMNNs successfully compress simulations away from the fiducial model at which the network is fitted, indicating a promising alternative to n-point statistics in catalogue simulation-based analyses.
Submitted 22 December, 2022; v1 submitted 11 July, 2022;
originally announced July 2022.
-
Quijote PNG: The information content of the halo power spectrum and bispectrum
Authors:
William R Coulton,
Francisco Villaescusa-Navarro,
Drew Jamieson,
Marco Baldi,
Gabriel Jung,
Dionysios Karagiannis,
Michele Liguori,
Licia Verde,
Benjamin D. Wandelt
Abstract:
We investigate how much can be learnt about four types of primordial non-Gaussianity (PNG) from small-scale measurements of the halo field. Using the QUIJOTE-PNG simulations, we quantify the information content accessible with measurements of the halo power spectrum monopole and quadrupole, the matter power spectrum, the halo-matter cross spectrum and the halo bispectrum monopole. This analysis is the first to include small, non-linear scales, up to $k_\mathrm{max}=0.5 \mathrm{h/Mpc}$, and to explore whether these scales can break degeneracies with cosmological and nuisance parameters making use of thousands of N-body simulations. We perform all the halo measurements in redshift space with a single sample comprised of all halos with mass $>3.2 \times 10^{13}~h^{-1}M_\odot$. For local PNG, measurements of the scale dependent bias effect from the power spectrum using sample variance cancellation provide significantly tighter constraints than measurements of the halo bispectrum. In this case measurements of the small scales add minimal additional constraining power. In contrast, the information on equilateral and orthogonal PNG is primarily accessible through the bispectrum. For these shapes, small scale measurements increase the constraining power of the halo bispectrum by up to $\times4$, though the addition of scales beyond $k\approx 0.3 \mathrm{h/Mpc}$ improves constraints largely through reducing degeneracies between PNG and the other parameters. These degeneracies are even more powerfully mitigated through combining power spectrum and bispectrum measurements. However even with combined measurements and small scale information, equilateral non-Gaussianity remains highly degenerate with $σ_8$ and our bias model.
Submitted 20 December, 2022; v1 submitted 30 June, 2022;
originally announced June 2022.
-
Quijote-PNG: Quasi-maximum likelihood estimation of Primordial Non-Gaussianity in the non-linear dark matter density field
Authors:
Gabriel Jung,
Dionysios Karagiannis,
Michele Liguori,
Marco Baldi,
William R Coulton,
Drew Jamieson,
Licia Verde,
Francisco Villaescusa-Navarro,
Benjamin D. Wandelt
Abstract:
Future Large Scale Structure surveys are expected to improve over current bounds on primordial non-Gaussianity (PNG), with a significant impact on our understanding of early Universe physics. The level of such improvements will however strongly depend on the extent to which late time non-linearities erase the PNG signal on small scales. In this work, we show how much primordial information remains in the bispectrum of the non-linear dark matter density field by implementing a new, simulation-based, methodology for joint estimation of PNG amplitudes ($f_{\rm NL}$) and standard $Λ$CDM parameters. The estimator is based on optimally compressed statistics, which, for a given input density field, combine power spectrum and modal bispectrum measurements, and numerically evaluate their covariance and their response to changes in cosmological parameters. We train and validate the estimator using a large suite of N-body simulations (QUIJOTE-PNG), including different types of PNG (local, equilateral, orthogonal). We explicitly test the estimator's unbiasedness, optimality and stability with respect to changes in the total number of input realizations. While the dark matter power spectrum itself contains negligible PNG information, as expected, including it as an ancillary statistic increases the PNG information content extracted from the bispectrum by a factor of order $2$. As a result, we prove the capability of our approach to optimally extract PNG information on non-linear scales beyond the perturbative regime, up to $k_{\rm max} = 0.5~h\,{\rm Mpc}^{-1}$, obtaining marginalized $1$-$σ$ bounds of $Δf_{\rm NL}^{\rm local} \sim 16$, $Δf_{\rm NL}^{\rm equil} \sim 77$ and $Δf_{\rm NL}^{\rm ortho} \sim 40$ on a cubic volume of $1~(\mathrm{Gpc}/h)^3$ at $z=1$. At the same time, we discuss the significant information on cosmological parameters contained on these scales.
Submitted 3 June, 2022;
originally announced June 2022.
-
Quijote-PNG: Simulations of primordial non-Gaussianity and the information content of the matter field power spectrum and bispectrum
Authors:
William R Coulton,
Francisco Villaescusa-Navarro,
Drew Jamieson,
Marco Baldi,
Gabriel Jung,
Dionysios Karagiannis,
Michele Liguori,
Licia Verde,
Benjamin D. Wandelt
Abstract:
Primordial non-Gaussianity (PNG) is one of the most powerful probes of the early Universe, and measurements of the large scale structure of the Universe have the potential to transform our understanding of this area. However, relating measurements of the late time Universe to the primordial perturbations is challenging due to the non-linear processes that govern the evolution of the Universe. To help address this issue we release a large suite of N-body simulations containing four types of PNG: \textsc{quijote-png}. These simulations were designed to augment the \textsc{quijote} suite of simulations that explored the impact of various cosmological parameters on large scale structure observables. Using these simulations we investigate how much information on PNG can be extracted by extending power spectrum and bispectrum measurements beyond the perturbative regime at $z=0.0$. This is the first joint analysis of the PNG and cosmological information content accessible with power spectrum and bispectrum measurements of the non-linear scales. We find that the constraining power improves significantly up to $k_\mathrm{max}\approx 0.3 h/{\rm Mpc}$, with diminishing returns beyond as the statistical probes' signal-to-noise ratios saturate. This saturation emphasizes the importance of accurately modelling all the contributions to the covariance matrix. Further, we find that combining the two probes is a powerful method of breaking the degeneracies with the $Λ$CDM parameters.
Submitted 26 May, 2023; v1 submitted 3 June, 2022;
originally announced June 2022.
-
Euclid: Cosmological forecasts from the void size function
Authors:
S. Contarini,
G. Verza,
A. Pisani,
N. Hamaus,
M. Sahlén,
C. Carbone,
S. Dusini,
F. Marulli,
L. Moscardini,
A. Renzi,
C. Sirignano,
L. Stanco,
M. Aubert,
M. Bonici,
G. Castignani,
H. M. Courtois,
S. Escoffier,
D. Guinet,
A. Kovacs,
G. Lavaux,
E. Massara,
S. Nadathur,
G. Pollina,
T. Ronconi,
F. Ruppin
, et al. (101 additional authors not shown)
Abstract:
The Euclid mission $-$ with its spectroscopic galaxy survey covering a sky area over $15\,000 \ \mathrm{deg}^2$ in the redshift range $0.9<z<1.8\ -$ will provide a sample of tens of thousands of cosmic voids. This paper explores for the first time the constraining power of the void size function on the properties of dark energy (DE) from a survey mock catalogue, the official Euclid Flagship simulation. We identify voids in the Flagship light-cone, which closely matches the features of the upcoming Euclid spectroscopic data set. We model the void size function considering a state-of-the-art methodology: we rely on the volume conserving (Vdn) model, a modification of the popular Sheth & van de Weygaert model for void number counts, extended by means of a linear function of the large-scale galaxy bias. We find an excellent agreement between model predictions and measured mock void number counts. We compute updated forecasts for the Euclid mission on DE from the void size function and provide reliable void number estimates to serve as a basis for further forecasts of cosmological applications using voids. We analyse two different cosmological models for DE: the first described by a constant DE equation of state parameter, $w$, and the second by a dynamic equation of state with coefficients $w_0$ and $w_a$. We forecast $1σ$ errors on $w$ lower than $10\%$, and we estimate an expected figure of merit (FoM) for the dynamical DE scenario $\mathrm{FoM}_{w_0,w_a} = 17$ when considering only the neutrino mass as an additional free parameter of the model. The analysis is based on conservative assumptions to ensure full robustness, and is a pathfinder for future enhancements of the technique. Our results showcase the impressive constraining power of the void size function from the Euclid spectroscopic sample, both as a stand-alone probe, and to be combined with other Euclid cosmological probes.
Submitted 25 November, 2022; v1 submitted 23 May, 2022;
originally announced May 2022.
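The figure of merit quoted in the Euclid void size function abstract above can be illustrated with a short calculation. The sketch below assumes one common convention, $\mathrm{FoM}_{w_0,w_a} = 1/\sqrt{\det C}$, where $C$ is the marginalized $2\times2$ covariance of $(w_0, w_a)$; the abstract does not state which convention the paper adopts. The function name and the input numbers are hypothetical and are not taken from the paper.

```python
import math

def figure_of_merit(cov_w0_w0, cov_w0_wa, cov_wa_wa):
    """Return 1 / sqrt(det C) for the 2x2 marginalized (w0, wa) covariance C.

    Assumed convention: a larger FoM means a smaller error ellipse.
    """
    det = cov_w0_w0 * cov_wa_wa - cov_w0_wa ** 2
    if det <= 0.0:
        raise ValueError("covariance matrix must be positive definite")
    return 1.0 / math.sqrt(det)

# Hypothetical 1-sigma errors and correlation, chosen only to show the arithmetic.
sigma_w0, sigma_wa, rho = 0.1, 0.4, -0.6
fom = figure_of_merit(sigma_w0 ** 2, rho * sigma_w0 * sigma_wa, sigma_wa ** 2)
print(f"FoM = {fom:.2f}")  # prints "FoM = 31.25"
```

Under this convention, a stronger correlation between $w_0$ and $w_a$ shrinks the determinant and raises the FoM even when the individual marginal errors stay fixed.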
-
Bayesian Control Variates for optimal covariance estimation with pairs of simulations and surrogates
Authors:
Nicolas Chartier,
Benjamin D. Wandelt
Abstract:
Predictions of the mean and covariance matrix of summary statistics are critical for confronting cosmological theories with observations, not least for likelihood approximations and parameter inference. The price to pay for accurate estimates is the extreme cost of running $N$-body and hydrodynamics simulations. Approximate solvers, or surrogates, greatly reduce the computational cost but can introduce significant biases, for example in the non-linear regime of cosmic structure growth. We propose "CARPool Bayes", an approach to solve the inference problem for both the means and covariances using a combination of simulations and surrogates. Our framework allows incorporating prior information for the mean and covariance. We derive closed-form solutions for Maximum A Posteriori covariance estimates that are efficient Bayesian shrinkage estimators, guarantee positive semi-definiteness, and can optionally leverage analytical covariance approximations. We discuss choices of the prior and propose a simple procedure for obtaining optimal prior hyperparameter values with a small set of test simulations. We test our method by estimating the covariances of clustering statistics of GADGET-III $N$-body simulations at redshift $z=0.5$ using surrogates from a 100-1000$\times$ faster particle-mesh code. Taking the sample covariance from 15,000 simulations as the truth, and using an empirical Bayes prior with diagonal blocks, our estimator produces nearly identical Fisher matrix contours for $Λ$CDM parameters using only $15$ simulations of the non-linear dark matter power spectrum. In this case the number of simulations is so small that the sample covariance would be degenerate. We show cases where even with a naïve prior our method still improves the estimate. Our framework is applicable to a wide range of cosmological and astrophysical problems where fast surrogates are available.
Submitted 12 April, 2022; v1 submitted 6 April, 2022;
originally announced April 2022.
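The idea of pairing expensive simulations with cheap surrogates, described in the abstract above, builds on control variates. The sketch below shows only the classical scalar control-variate estimator for a mean. It is not the Bayesian MAP covariance estimator that the paper derives. The function names and toy data are hypothetical, and the surrogate mean is assumed to be known accurately from many cheap runs.

```python
def mean(values):
    return sum(values) / len(values)

def control_variate_mean(sims, surrogates, surrogate_mean):
    """Classical control-variate estimate of the mean of `sims`.

    sims[i] and surrogates[i] are assumed to be paired (same initial
    conditions); surrogate_mean is the surrogate's mean, assumed known.
    """
    if len(sims) != len(surrogates) or len(sims) < 2:
        raise ValueError("need at least two paired samples")
    y_bar, c_bar = mean(sims), mean(surrogates)
    cov_yc = sum((y - y_bar) * (c - c_bar) for y, c in zip(sims, surrogates))
    var_c = sum((c - c_bar) ** 2 for c in surrogates)
    if var_c == 0.0:
        raise ValueError("surrogate samples have zero variance")
    beta = cov_yc / var_c  # variance-minimizing coefficient in the scalar case
    # Correct the simulation mean by the surrogate's known sampling error.
    return y_bar - beta * (c_bar - surrogate_mean)

# Toy paired data: the "simulation" is an exact linear function of the surrogate,
# so the correction removes the sampling noise entirely.
surrogates = [0.8, 1.1, 0.9, 1.3]
sims = [2.0 * c + 1.0 for c in surrogates]
estimate = control_variate_mean(sims, surrogates, surrogate_mean=1.0)
```

In this idealized case the estimate equals $2 \times 1.0 + 1.0 = 3$ regardless of which four samples were drawn. With realistic, imperfectly correlated surrogates the variance is reduced rather than eliminated, and estimating the coefficient from the same samples introduces a small bias.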
-
Constraining cosmology with machine learning and galaxy clustering: the CAMELS-SAM suite
Authors:
Lucia A. Perez,
Shy Genel,
Francisco Villaescusa-Navarro,
Rachel S. Somerville,
Austen Gabrielpillai,
Daniel Anglés-Alcázar,
Benjamin D. Wandelt,
L. Y. Aaron Yung
Abstract:
As the next generation of large galaxy surveys comes online, it is becoming increasingly important to develop and understand the machine learning tools that analyze big astronomical data. Neural networks are powerful and capable of probing deep patterns in data, but must be trained carefully on large and representative data sets. We developed and generated a new `hump' of the Cosmology and Astrophysics with MachinE Learning Simulations (CAMELS) project: CAMELS-SAM, encompassing one thousand dark-matter only simulations of (100 $h^{-1}$ cMpc)$^3$ with different cosmological parameters ($Ω_m$ and $σ_8$) and run through the Santa Cruz semi-analytic model for galaxy formation over a broad range of astrophysical parameters. As a proof-of-concept for the power of this vast suite of simulated galaxies in a large volume and broad parameter space, we probe the power of simple clustering summary statistics to marginalize over astrophysics and constrain cosmology using neural networks. We use the two-point correlation function, count-in-cells, and the Void Probability Function, and probe non-linear and linear scales across $0.68<$ R $<27\ h^{-1}$ cMpc. Our cosmological constraints cluster around 3-8$\%$ error on $Ω_{\text{M}}$ and $σ_8$, and we explore the effect of various galaxy selections, galaxy sampling, and choice of clustering statistics on these constraints. We additionally explore how these clustering statistics constrain and inform key stellar and galactic feedback parameters in the Santa Cruz SAM. CAMELS-SAM has been publicly released alongside the rest of CAMELS, and offers great potential to many applications of machine learning in astrophysics: https://camels-sam.readthedocs.io.
Submitted 22 May, 2023; v1 submitted 5 April, 2022;
originally announced April 2022.
-
Implicit Likelihood Inference of Reionization Parameters from the 21 cm Power Spectrum
Authors:
Xiaosheng Zhao,
Yi Mao,
Benjamin D. Wandelt
Abstract:
The first measurements of the 21 cm brightness temperature power spectrum from the epoch of reionization will very likely be achieved in the near future by radio interferometric array experiments such as the Hydrogen Epoch of Reionization Array (HERA) and the Square Kilometre Array (SKA). Standard MCMC analyses use an explicit likelihood approximation to infer the reionization parameters from the 21 cm power spectrum. In this paper, we present a new Bayesian inference of the reionization parameters where the likelihood is implicitly defined through forward simulations using density estimation likelihood-free inference (DELFI). Realistic effects including thermal noise and foreground avoidance are also applied to the mock observations from the HERA and SKA. We demonstrate that this method recovers accurate posterior distributions for the reionization parameters, and outperforms the standard MCMC analysis in terms of the location and size of credible parameter regions. With the minutes-level processing time once the network is trained, this technique is a promising approach for the scientific interpretation of future 21 cm power spectrum observation data. Our code 21cmDELFI-PS is publicly available at this link.
Submitted 10 June, 2022; v1 submitted 29 March, 2022;
originally announced March 2022.
-
Inflation: Theory and Observations
Authors:
Ana Achúcarro,
Matteo Biagetti,
Matteo Braglia,
Giovanni Cabass,
Robert Caldwell,
Emanuele Castorina,
Xingang Chen,
William Coulton,
Raphael Flauger,
Jacopo Fumagalli,
Mikhail M. Ivanov,
Hayden Lee,
Azadeh Maleknejad,
P. Daniel Meerburg,
Azadeh Moradinezhad Dizgah,
Gonzalo A. Palma,
Guilherme L. Pimentel,
Sébastien Renaux-Petel,
Benjamin Wallisch,
Benjamin D. Wandelt,
Lukas T. Witkowski,
W. L. Kimmy Wu
Abstract:
Cosmic inflation provides a window to the highest energy densities accessible in nature, far beyond those achievable in any realistic terrestrial experiment. Theoretical insights into the inflationary era and its observational probes may therefore shed unique light on the physical laws underlying our universe. This white paper describes our current theoretical understanding of the inflationary era, with a focus on the statistical properties of primordial fluctuations. In particular, we survey observational targets for three important signatures of inflation: primordial gravitational waves, primordial non-Gaussianity and primordial features. With the requisite advancements in analysis techniques, the tremendous increase in the raw sensitivities of upcoming and planned surveys will translate to leaps in our understanding of the inflationary paradigm and could open new frontiers for cosmology and particle physics. The combination of future theoretical and observational developments therefore offers the potential for a dramatic discovery about the nature of cosmic acceleration in the very early universe and physics on the smallest scales.
Submitted 29 September, 2022; v1 submitted 15 March, 2022;
originally announced March 2022.
-
Snowmass2021 CMB-HD White Paper
Authors:
The CMB-HD Collaboration,
Simone Aiola,
Yashar Akrami,
Kaustuv Basu,
Michael Boylan-Kolchin,
Thejs Brinckmann,
Sean Bryan,
Caitlin M. Casey,
Jens Chluba,
Sebastien Clesse,
Francis-Yan Cyr-Racine,
Luca Di Mascolo,
Simon Dicker,
Thomas Essinger-Hileman,
Gerrit S. Farren,
Michael A. Fedderke,
Simone Ferraro,
George M. Fuller,
Nicholas Galitzki,
Vera Gluscevic,
Daniel Grin,
Dongwon Han,
Matthew Hasselfield,
Renee Hlozek
, et al. (40 additional authors not shown)
Abstract:
CMB-HD is a proposed millimeter-wave survey over half the sky that would be ultra-deep (0.5 uK-arcmin) and have unprecedented resolution (15 arcseconds at 150 GHz). Such a survey would answer many outstanding questions about the fundamental physics of the Universe. Major advances would be 1.) the use of gravitational lensing of the primordial microwave background to map the distribution of matter on small scales (k~10 h Mpc^(-1)), which probes dark matter particle properties. It will also allow 2.) measurements of the thermal and kinetic Sunyaev-Zel'dovich effects on small scales to map the gas density and velocity, another probe of cosmic structure. In addition, CMB-HD would allow us to cross critical thresholds: 3.) ruling out or detecting any new, light (< 0.1 eV) particles that were in thermal equilibrium with known particles in the early Universe, 4.) testing a wide class of multi-field models that could explain an epoch of inflation in the early Universe, and 5.) ruling out or detecting inflationary magnetic fields. CMB-HD would also provide world-leading constraints on 6.) axion-like particles, 7.) cosmic birefringence, 8.) the sum of the neutrino masses, and 9.) the dark energy equation of state. The CMB-HD survey would be delivered in 7.5 years of observing 20,000 square degrees of sky, using two new 30-meter-class off-axis crossed Dragone telescopes to be located at Cerro Toco in the Atacama Desert. Each telescope would field 800,000 detectors (200,000 pixels), for a total of 1.6 million detectors.
Submitted 10 March, 2022;
originally announced March 2022.
-
Cross-correlating dark sirens and galaxies: measurement of $H_0$ from GWTC-3 of LIGO-Virgo-KAGRA
Authors:
Suvodip Mukherjee,
Alex Krolewski,
Benjamin D. Wandelt,
Joseph Silk
Abstract:
We measure the Hubble constant of the Universe using spatial cross-correlation between gravitational wave (GW) sources without electromagnetic counterparts from the third GW Transient Catalog (GWTC-3), and the photometric galaxy surveys 2MPZ and WISE-SuperCOSMOS. Using the eight well-localised GW events, we obtain Hubble constant $H_0= 68.2_{-6.2}^{+26.0}$ km/s/Mpc (median and 68.3$\%$ equal-tailed interval (ETI)) after marginalizing over the matter density and the GW bias parameters. Though the constraints are weak due to a limited number of GW sources and poor sky localization, they are not subject to assumptions regarding the GW mass distribution. By combining this measurement with the Hubble constant measurement from binary neutron star GW170817, we find a value of Hubble constant $H_0= 67.0_{-3.8}^{+6.3}$ km/s/Mpc (median and 68.3$\%$ ETI).
Submitted 7 March, 2022;
originally announced March 2022.
-
Breaking baryon-cosmology degeneracy with the electron density power spectrum
Authors:
Andrina Nicola,
Francisco Villaescusa-Navarro,
David N. Spergel,
Jo Dunkley,
Daniel Anglés-Alcázar,
Romeel Davé,
Shy Genel,
Lars Hernquist,
Daisuke Nagai,
Rachel S. Somerville,
Benjamin D. Wandelt
Abstract:
Uncertain feedback processes in galaxies affect the distribution of matter, currently limiting the power of weak lensing surveys. If we can identify cosmological statistics that are robust against these uncertainties, or constrain these effects by other means, then we can enhance the power of current and upcoming observations from weak lensing surveys such as DES, Euclid, the Rubin Observatory, and the Roman Space Telescope. In this work, we investigate the potential of the electron density auto-power spectrum as a robust probe of cosmology and baryonic feedback. We use a suite of (magneto-)hydrodynamic simulations from the CAMELS project and perform an idealized analysis to forecast statistical uncertainties on a limited set of cosmological and physically-motivated astrophysical parameters. We find that the electron number density auto-correlation, measurable through either kinematic Sunyaev-Zel'dovich observations or through Fast Radio Burst dispersion measures, provides tight constraints on $Ω_{m}$ and the mean baryon fraction in intermediate-mass halos, $\bar{f}_{\mathrm{bar}}$. By obtaining an empirical measure for the associated systematic uncertainties, we find these constraints to be largely robust to differences in baryonic feedback models implemented in hydrodynamic simulations. We further discuss the main caveats associated with our analysis, and point out possible directions for future work.
Submitted 11 January, 2022;
originally announced January 2022.
-
Rubin-Euclid Derived Data Products: Initial Recommendations
Authors:
Leanne P. Guy,
Jean-Charles Cuillandre,
Etienne Bachelet,
Manda Banerji,
Franz E. Bauer,
Thomas Collett,
Christopher J. Conselice,
Siegfried Eggl,
Annette Ferguson,
Adriano Fontana,
Catherine Heymans,
Isobel M. Hook,
Éric Aubourg,
Hervé Aussel,
James Bosch,
Benoit Carry,
Henk Hoekstra,
Konrad Kuijken,
Francois Lanusse,
Peter Melchior,
Joseph Mohr,
Michele Moresco,
Reiko Nakajima,
Stéphane Paltani,
Michael Troxel
, et al. (95 additional authors not shown)
Abstract:
This report is the result of a joint discussion between the Rubin and Euclid scientific communities. The work presented in this report was focused on designing and recommending an initial set of Derived Data Products (DDPs) that could realize the science goals enabled by joint processing. All interested Rubin and Euclid data rights holders were invited to contribute via an online discussion forum and a series of virtual meetings. Strong interest in enhancing science with joint DDPs emerged from across a wide range of astrophysical domains: Solar System, the Galaxy, the Local Volume, from the nearby to the primaeval Universe, and cosmology.
Submitted 13 October, 2022; v1 submitted 11 January, 2022;
originally announced January 2022.
-
The CAMELS project: public data release
Authors:
Francisco Villaescusa-Navarro,
Shy Genel,
Daniel Anglés-Alcázar,
Lucia A. Perez,
Pablo Villanueva-Domingo,
Digvijay Wadekar,
Helen Shao,
Faizan G. Mohammad,
Sultan Hassan,
Emily Moser,
Erwin T. Lau,
Luis Fernando Machado Poletti Valle,
Andrina Nicola,
Leander Thiele,
Yongseok Jo,
Oliver H. E. Philcox,
Benjamin D. Oppenheimer,
Megan Tillman,
ChangHoon Hahn,
Neerav Kaushal,
Alice Pisani,
Matthew Gebhardt,
Ana Maria Delgado,
Joyce Caliendo,
Christina Kreisch
, et al. (22 additional authors not shown)
Abstract:
The Cosmology and Astrophysics with MachinE Learning Simulations (CAMELS) project was developed to combine cosmology with astrophysics through thousands of cosmological hydrodynamic simulations and machine learning. CAMELS contains 4,233 cosmological simulations: 2,049 N-body and 2,184 state-of-the-art hydrodynamic simulations that sample a vast volume in parameter space. In this paper we present the CAMELS public data release, describing the characteristics of the CAMELS simulations and a variety of data products generated from them, including halo, subhalo, galaxy, and void catalogues, power spectra, bispectra, Lyman-$α$ spectra, probability distribution functions, halo radial profiles, and X-ray photon lists. We also release over one thousand catalogues that contain billions of galaxies from CAMELS-SAM: a large collection of N-body simulations that have been combined with the Santa Cruz Semi-Analytic Model. We release all the data, comprising more than 350 terabytes and containing 143,922 snapshots, millions of halos, galaxies and summary statistics. We provide further technical details on how to access, download, read, and process the data at https://camels.readthedocs.io.
Submitted 4 January, 2022;
originally announced January 2022.
-
Simulation-Based Inference of Strong Gravitational Lensing Parameters
Authors:
Ronan Legin,
Yashar Hezaveh,
Laurence Perreault Levasseur,
Benjamin Wandelt
Abstract:
In the coming years, a new generation of sky surveys, in particular, Euclid Space Telescope (2022), and the Rubin Observatory's Legacy Survey of Space and Time (LSST, 2023) will discover more than 200,000 new strong gravitational lenses, which represents an increase of more than two orders of magnitude compared to currently known sample sizes. Accurate and fast analysis of such large volumes of data under a statistical framework is therefore crucial for all sciences enabled by strong lensing. Here, we report on the application of simulation-based inference methods, in particular, density estimation techniques, to the predictions of the set of parameters of strong lensing systems from neural networks. This allows us to explicitly impose desired priors on lensing parameters, while guaranteeing convergence to the optimal posterior in the limit of perfect performance.
Submitted 14 June, 2022; v1 submitted 9 December, 2021;
originally announced December 2021.
-
Single frequency CMB B-mode inference with realistic foregrounds from a single training image
Authors:
Niall Jeffrey,
François Boulanger,
Benjamin D. Wandelt,
Bruno Regaldo-Saint Blancard,
Erwan Allys,
François Levrier
Abstract:
With a single training image and using wavelet phase harmonic augmentation, we present polarized Cosmic Microwave Background (CMB) foreground marginalization in a high-dimensional likelihood-free (Bayesian) framework. We demonstrate robust foreground removal using only a single frequency of simulated data for a BICEP-like sky patch. Using Moment Networks we estimate the pixel-level posterior probability for the underlying {E,B} signal and validate the statistical model with a quantile-type test using the estimated marginal posterior moments. The Moment Networks use a hierarchy of U-Net convolutional neural networks. This work validates such an approach in the most difficult limiting case: pixel-level, noise-free, highly non-Gaussian dust foregrounds with a single training image at a single frequency. For a real CMB experiment, a small number of representative sky patches would provide the training data required for full cosmological inference. These results enable robust likelihood-free, simulation-based parameter and model inference for primordial B-mode detection using observed CMB polarization data.
Submitted 1 November, 2021;
originally announced November 2021.
-
GLADE+: An Extended Galaxy Catalogue for Multimessenger Searches with Advanced Gravitational-wave Detectors
Authors:
G. Dálya,
R. Díaz,
F. R. Bouchet,
Z. Frei,
J. Jasche,
G. Lavaux,
R. Macas,
S. Mukherjee,
M. Pálfi,
R. S. de Souza,
B. D. Wandelt,
M. Bilicki,
P. Raffai
Abstract:
We present GLADE+, an extended version of the GLADE galaxy catalogue introduced in our previous paper for multimessenger searches with advanced gravitational-wave detectors. GLADE+ combines data from six separate but not independent astronomical catalogues: the GWGC, 2MPZ, 2MASS XSC, HyperLEDA, and WISExSCOSPZ galaxy catalogues, and the SDSS-DR16Q quasar catalogue. To allow corrections of CMB-frame redshifts for peculiar motions, we calculated peculiar velocities along with their standard deviations of all galaxies having $B$-band magnitude data within redshift $z=0.05$ using the "Bayesian Origin Reconstruction from Galaxies" formalism. GLADE+ is complete up to luminosity distance $d_L=47^{+4}_{-2}$ Mpc in terms of the total expected $B$-band luminosity of galaxies, and contains all of the brightest galaxies giving 90\% of the total $B$-band and $K$-band luminosity up to $d_L\simeq 130$ Mpc. We include estimations of stellar masses and individual binary neutron star merger rates for galaxies with $W1$ magnitudes. These parameters can help in ranking galaxies in a given gravitational wave localization volume in terms of their likelihood of being hosts, thereby possibly reducing the number of pointings and total integration time needed to find the electromagnetic counterpart.
Submitted 2 June, 2022; v1 submitted 12 October, 2021;
originally announced October 2021.
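The GLADE+ abstract above mentions correcting CMB-frame redshifts for peculiar motions. A minimal sketch of such a correction uses the standard relation $(1+z_{\rm obs}) = (1+z_{\rm cos})(1+z_{\rm pec})$ with the low-velocity approximation $z_{\rm pec} \approx v_{\rm pec}/c$. The function name and example values are hypothetical, and this is not necessarily the exact expression used to build the catalogue.

```python
C_KM_S = 299_792.458  # speed of light in km/s (exact by definition)

def cosmological_redshift(z_observed, v_pec_km_s):
    """Remove a line-of-sight peculiar velocity from an observed redshift.

    Uses (1 + z_obs) = (1 + z_cos) * (1 + z_pec) with z_pec ~ v_pec / c,
    valid for |v_pec| << c. Positive v_pec means the galaxy is receding.
    """
    z_pec = v_pec_km_s / C_KM_S
    return (1.0 + z_observed) / (1.0 + z_pec) - 1.0

# Hypothetical galaxy: CMB-frame redshift 0.02, receding at 300 km/s.
z_cos = cosmological_redshift(0.02, 300.0)
```

A receding peculiar velocity inflates the observed redshift, so the corrected value is smaller than the observed one. At $z \lesssim 0.05$ a few hundred km/s is a percent-level or larger shift, which is why the correction matters for nearby host-galaxy ranking.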
-
HIFlow: Generating Diverse HI Maps and Inferring Cosmology while Marginalizing over Astrophysics using Normalizing Flows
Authors:
Sultan Hassan,
Francisco Villaescusa-Navarro,
Benjamin Wandelt,
David N. Spergel,
Daniel Anglés-Alcázar,
Shy Genel,
Miles Cranmer,
Greg L. Bryan,
Romeel Davé,
Rachel S. Somerville,
Michael Eickenberg,
Desika Narayanan,
Shirley Ho,
Sambatra Andrianomena
Abstract:
A wealth of cosmological and astrophysical information is expected from many ongoing and upcoming large-scale surveys. It is crucial to prepare for these surveys now and develop tools that can efficiently extract the most information. We present HIFlow: a fast generative model of the neutral hydrogen (HI) maps that is conditioned only on cosmology ($Ω_{m}$ and $σ_{8}$) and designed using a class of normalizing flow models, the Masked Autoregressive Flow (MAF). HIFlow is trained on the state-of-the-art simulations from the Cosmology and Astrophysics with MachinE Learning Simulations (CAMELS) project. HIFlow has the ability to generate realistic diverse maps without explicitly incorporating the expected 2D map structure into the flow as an inductive bias. We find that HIFlow is able to reproduce the CAMELS average and standard deviation HI power spectrum (Pk) within a factor of $\lesssim$ 2, scoring a very high $R^{2} > 90\%$. By inverting the flow, HIFlow provides a tractable high-dimensional likelihood for efficient parameter inference. We show that the conditional HIFlow on cosmology is successfully able to marginalize over astrophysics at the field level, regardless of the stellar and AGN feedback strengths. This new tool represents a first step toward more powerful parameter inference, maximizing the scientific return of future HI surveys, and opening a new avenue to minimize the loss of complex information due to data compression down to summary statistics.
Submitted 18 August, 2022; v1 submitted 6 October, 2021;
originally announced October 2021.
-
The CAMELS Multifield Dataset: Learning the Universe's Fundamental Parameters with Artificial Intelligence
Authors:
Francisco Villaescusa-Navarro,
Shy Genel,
Daniel Angles-Alcazar,
Leander Thiele,
Romeel Dave,
Desika Narayanan,
Andrina Nicola,
Yin Li,
Pablo Villanueva-Domingo,
Benjamin Wandelt,
David N. Spergel,
Rachel S. Somerville,
Jose Manuel Zorrilla Matilla,
Faizan G. Mohammad,
Sultan Hassan,
Helen Shao,
Digvijay Wadekar,
Michael Eickenberg,
Kaze W. K. Wong,
Gabriella Contardo,
Yongseok Jo,
Emily Moser,
Erwin T. Lau,
Luis Fernando Machado Poletti Valle,
Lucia A. Perez
, et al. (3 additional authors not shown)
Abstract:
We present the Cosmology and Astrophysics with MachinE Learning Simulations (CAMELS) Multifield Dataset, CMD, a collection of hundreds of thousands of 2D maps and 3D grids containing many different properties of cosmic gas, dark matter, and stars from 2,000 distinct simulated universes at several cosmic times. The 2D maps and 3D grids represent cosmic regions that span $\sim$100 million light years and have been generated from thousands of state-of-the-art hydrodynamic and gravity-only N-body simulations from the CAMELS project. Designed to train machine learning models, CMD is the largest dataset of its kind containing more than 70 Terabytes of data. In this paper we describe CMD in detail and outline a few of its applications. We focus our attention on one such task, parameter inference, formulating the problems we face as a challenge to the community. We release all data and provide further technical details at https://camels-multifield-dataset.readthedocs.io.
Submitted 22 September, 2021;
originally announced September 2021.
-
Robust marginalization of baryonic effects for cosmological inference at the field level
Authors:
Francisco Villaescusa-Navarro,
Shy Genel,
Daniel Angles-Alcazar,
David N. Spergel,
Yin Li,
Benjamin Wandelt,
Leander Thiele,
Andrina Nicola,
Jose Manuel Zorrilla Matilla,
Helen Shao,
Sultan Hassan,
Desika Narayanan,
Romeel Dave,
Mark Vogelsberger
Abstract:
We train neural networks to perform likelihood-free inference from $(25\,h^{-1}{\rm Mpc})^2$ 2D maps containing the total mass surface density from thousands of hydrodynamic simulations of the CAMELS project. We show that the networks can extract information beyond one-point functions and power spectra from all resolved scales ($\gtrsim 100\,h^{-1}{\rm kpc}$) while performing a robust marginalization over baryonic physics at the field level: the model can infer the value of $Ω_{\rm m} (\pm 4\%)$ and $σ_8 (\pm 2.5\%)$ from simulations completely different to the ones used to train it.
Submitted 21 September, 2021;
originally announced September 2021.
-
Multifield Cosmology with Artificial Intelligence
Authors:
Francisco Villaescusa-Navarro,
Daniel Anglés-Alcázar,
Shy Genel,
David N. Spergel,
Yin Li,
Benjamin Wandelt,
Andrina Nicola,
Leander Thiele,
Sultan Hassan,
Jose Manuel Zorrilla Matilla,
Desika Narayanan,
Romeel Dave,
Mark Vogelsberger
Abstract:
Astrophysical processes such as feedback from supernovae and active galactic nuclei modify the properties and spatial distribution of dark matter, gas, and galaxies in a poorly understood way. This uncertainty is one of the main theoretical obstacles to extracting information from cosmological surveys. We use 2,000 state-of-the-art hydrodynamic simulations from the CAMELS project spanning a wide variety of cosmological and astrophysical models and generate hundreds of thousands of 2-dimensional maps for 13 different fields: from dark matter to gas and stellar properties. We use these maps to train convolutional neural networks to extract the maximum amount of cosmological information while marginalizing over astrophysical effects at the field level. Although our maps only cover a small area of $(25~h^{-1}{\rm Mpc})^2$, and the different fields are contaminated by astrophysical effects in very different ways, our networks can infer the values of $Ω_{\rm m}$ and $σ_8$ with a few percent level precision for most of the fields. We find that the marginalization performed by the network retains a wealth of cosmological information compared to a model trained on maps from gravity-only N-body simulations that are not contaminated by astrophysical effects. Finally, we train our networks on multifields -- 2D maps that contain several fields as different colors or channels -- and find that not only can they infer the value of all parameters with higher accuracy than networks trained on individual fields, but they can also constrain the value of $Ω_{\rm m}$ with higher accuracy than the maps from the N-body simulations.
Submitted 20 September, 2021;
originally announced September 2021.
-
Euclid: Forecasts from redshift-space distortions and the Alcock-Paczynski test with cosmic voids
Authors:
N. Hamaus,
M. Aubert,
A. Pisani,
S. Contarini,
G. Verza,
M. -C. Cousinou,
S. Escoffier,
A. Hawken,
G. Lavaux,
G. Pollina,
B. D. Wandelt,
J. Weller,
M. Bonici,
C. Carbone,
L. Guzzo,
A. Kovacs,
F. Marulli,
E. Massara,
L. Moscardini,
P. Ntelis,
W. J. Percival,
S. Radinović,
M. Sahlén,
Z. Sakr,
A. G. Sánchez
, et al. (105 additional authors not shown)
Abstract:
Euclid is poised to survey galaxies across a cosmological volume of unprecedented size, providing observations of more than a billion objects distributed over a third of the full sky. Approximately 20 million of these galaxies will have their spectroscopy available, allowing us to map the 3D large-scale structure of the Universe in great detail. This paper investigates prospects for the detection of cosmic voids therein and the unique benefit they provide for cosmology. In particular, we study the imprints of dynamic and geometric distortions of average void shapes and their constraining power on the growth of structure and cosmological distance ratios. To this end, we made use of the Flagship mock catalog, a state-of-the-art simulation of the data expected to be observed with Euclid. We arranged the data into four adjacent redshift bins, each of which contains about 11000 voids, and estimated the stacked void-galaxy cross-correlation function in every bin. Fitting a linear-theory model to the data, we obtained constraints on $f/b$ and $D_M H$, where $f$ is the linear growth rate of density fluctuations, $b$ the galaxy bias, $D_M$ the comoving angular diameter distance, and $H$ the Hubble rate. In addition, we marginalized over two nuisance parameters included in our model to account for unknown systematic effects. With this approach, Euclid will be able to reach a relative precision of about 4% on measurements of $f/b$ and 0.5% on $D_M H$ in each redshift bin. Better modeling or calibration of the nuisance parameters may further increase this precision to 1% and 0.4%, respectively. Our results show that the exploitation of cosmic voids in Euclid will provide competitive constraints on cosmology even as a stand-alone probe. For example, the equation-of-state parameter $w$ for dark energy will be measured with a precision of about 10%, consistent with previous more approximate forecasts.
Submitted 2 December, 2021; v1 submitted 23 August, 2021;
originally announced August 2021.
-
Lossless, Scalable Implicit Likelihood Inference for Cosmological Fields
Authors:
T. Lucas Makinen,
Tom Charnock,
Justin Alsing,
Benjamin D. Wandelt
Abstract:
We present a comparison of simulation-based inference to full, field-based analytical inference in cosmological data analysis. To do so, we explore parameter inference for two cases where the information content is calculable analytically: Gaussian random fields whose covariance depends on parameters through the power spectrum; and correlated lognormal fields with cosmological power spectra. We compare two inference techniques: i) explicit field-level inference using the known likelihood and ii) implicit likelihood inference with maximally informative summary statistics compressed via Information Maximising Neural Networks (IMNNs). We find that a) summaries obtained from convolutional neural network compression do not lose information and therefore saturate the known field information content, both for the Gaussian covariance and the lognormal cases, b) simulation-based inference using these maximally informative nonlinear summaries recovers nearly losslessly the exact posteriors of field-level inference, bypassing the need to evaluate expensive likelihoods or invert covariance matrices, and c) even for this simple example, implicit, simulation-based likelihood incurs a much smaller computational cost than inference with an explicit likelihood. This work uses a new IMNN implementation in $\texttt{Jax}$ that can take advantage of a fully differentiable simulation and inference pipeline. We also demonstrate that a single retraining of the IMNN summaries effectively achieves the theoretically maximal information, enhancing the robustness to the choice of fiducial model where the IMNN is trained.
Submitted 17 July, 2021; v1 submitted 15 July, 2021;
originally announced July 2021.
-
The GIGANTES dataset: precision cosmology from voids in the machine learning era
Authors:
Christina D. Kreisch,
Alice Pisani,
Francisco Villaescusa-Navarro,
David N. Spergel,
Benjamin D. Wandelt,
Nico Hamaus,
Adrian E. Bayer
Abstract:
We present GIGANTES, the most extensive and realistic void catalog suite ever released -- containing over 1 billion cosmic voids covering a volume larger than the observable Universe, more than 20 TB of data, and created by running the void finder VIDE on QUIJOTE's halo simulations. The expansive and detailed GIGANTES suite, spanning thousands of cosmological models, opens up the study of voids, answering compelling questions: Do voids carry unique cosmological information? How is this information correlated with galaxy information? Leveraging the large number of voids in the GIGANTES suite, our Fisher constraints demonstrate voids contain additional information, critically tightening constraints on cosmological parameters. We use traditional void summary statistics (void size function, void density profile) and the void auto-correlation function, which independently yields an error of $0.13\,\mathrm{eV}$ on $\sum\,m_ν$ for a 1 $h^{-3}\mathrm{Gpc}^3$ simulation, without CMB priors. Combining halos and voids we forecast an error of $0.09\,\mathrm{eV}$ from the same volume. Extrapolating to next generation multi-Gpc$^3$ surveys such as DESI, Euclid, SPHEREx, and the Roman Space Telescope, we expect voids should yield an independent determination of neutrino mass. Crucially, GIGANTES is the first void catalog suite expressly built for intensive machine learning exploration. We illustrate this by training a neural network to perform likelihood-free inference on the void size function. Cosmology problems provide an impetus to develop novel deep learning techniques, leveraging the symmetries embedded throughout the universe from physical laws, interpreting models, and accurately predicting errors. With GIGANTES, machine learning gains an impressive dataset, offering unique problems that will stimulate new techniques.
Submitted 22 July, 2021; v1 submitted 5 July, 2021;
originally announced July 2021.