Bayesian inference: More than Bayes's theorem

Authors: Thomas J. Loredo, Robert L. Wolpert

Abstract: Bayesian inference gets its name from *Bayes's theorem*, expressing posterior probabilities for hypotheses about a data generating process as the (normalized) product of prior probabilities and a likelihood function. But Bayesian inference uses all of probability theory, not just Bayes's theorem. Many hypotheses of scientific interest are *composite hypotheses*, with the strength of evidence for the hypothesis dependent on knowledge about auxiliary factors, such as the values of nuisance parameters (e.g., uncertain background rates or calibration factors). Many important capabilities of Bayesian methods arise from use of the law of total probability, which instructs analysts to compute probabilities for composite hypotheses by *marginalization* over auxiliary factors. This tutorial targets relative newcomers to Bayesian inference, aiming to complement tutorials that focus on Bayes's theorem and how priors modulate likelihoods. The emphasis here is on marginalization over parameter spaces -- both how it is the foundation for important capabilities, and how it may motivate caution when parameter spaces are large. Topics covered include the difference between likelihood and probability, understanding the impact of priors beyond merely shifting the maximum likelihood estimate, and the role of marginalization in accounting for uncertainty in nuisance parameters, systematic error, and model misspecification.

Submitted 28 June, 2024; v1 submitted 27 June, 2024; originally announced June 2024.

Comments: 35 pages, 11 figures; accepted for publication in Frontiers in Astronomy and Space Sciences (special issue for iid2022: Statistical Methods for Event Data - Illuminating the Dynamic Universe); fixed minor typo

arXiv:1807.09273 [pdf, other]

Statistical challenges in the search for dark matter

Authors: Sara Algeri, Melissa van Beekveld, Nassim Bozorgnia, Alyson Brooks, J. Alberto Casas, Jessi Cisewski-Kehe, Francis-Yan Cyr-Racine, Thomas D. P. Edwards, Fabio Iocco, Bradley J. Kavanagh, Judita Mamužić, Siddharth Mishra-Sharma, Wolfgang Rau, Roberto Ruiz de Austri, Benjamin R. Safdi, Pat Scott, Tracy R. Slatyer, Yue-Lin Sming Tsai, Aaron C. Vincent, Christoph Weniger, Jennifer Rittenhouse West, Robert L. Wolpert

Abstract: The search for the particle nature of dark matter has given rise to a number of experimental, theoretical and statistical challenges. Here, we report on a number of these statistical challenges and new techniques to address them, as discussed in the DMStat workshop held Feb 26 - Mar 3 2018 at the Banff International Research Station for Mathematical Innovation and Discovery (BIRS) in Banff, Alberta.

Submitted 24 July, 2018; originally announced July 2018.

Comments: 32 pages, 8 figures, 331 references

Report number: IPPP/18/60

arXiv:1711.01318 [pdf, other]

Improving Exoplanet Detection Power: Multivariate Gaussian Process Models for Stellar Activity

Authors: David E. Jones, David C. Stenning, Eric B. Ford, Robert L. Wolpert, Thomas J. Loredo, Christian Gilbertson, Xavier Dumusque

Abstract: The radial velocity method is one of the most successful techniques for detecting exoplanets. It works by detecting the velocity of a host star induced by the gravitational effect of an orbiting planet, specifically the velocity along our line of sight, which is called the radial velocity of the star. Low-mass planets typically cause their host star to move with radial velocities of 1 m/s or less. By analyzing a time series of stellar spectra from a host star, modern astronomical instruments can in theory detect such planets. However, in practice, intrinsic stellar variability (e.g., star spots, convective motion, pulsations) affects the spectra and often mimics a radial velocity signal. This signal contamination makes it difficult to reliably detect low-mass planets. A principled approach to recovering planet radial velocity signals in the presence of stellar activity was proposed by Rajpaul et al. (2015). It uses a multivariate Gaussian process model to jointly capture time series of the apparent radial velocity and multiple indicators of stellar activity. We build on this work in two ways: (i) we propose using dimension reduction techniques to construct new high-information stellar activity indicators; and (ii) we extend the Rajpaul et al. (2015) model to a larger class of models and use a power-based model comparison procedure to select the best model. Despite significant interest in exoplanets, previous efforts have not performed large-scale stellar activity model selection or attempted to evaluate models based on planet detection power. In the case of main sequence G2V stars, we find that our method substantially improves planet detection power compared to previous state-of-the-art approaches.

Submitted 25 August, 2020; v1 submitted 3 November, 2017; originally announced November 2017.

Comments: 37 pages, 7 figures

arXiv:1311.2587 [pdf, ps, other]

doi 10.1088/0004-637X/787/1/20

Dissecting galaxy formation models with Sensitivity Analysis -- A new approach to constrain the Milky Way formation history

Authors: Facundo A. Gómez, Christopher E. Coleman-Smith, Brian W. O'Shea, Jason Tumlinson, Robert L. Wolpert

Abstract: [Abridged] We present an application of a statistical tool known as Sensitivity Analysis to characterize the relationship between input parameters and observational predictions of semi-analytic models of galaxy formation coupled to cosmological $N$-body simulations. We show how a sensitivity analysis can be performed on our chemo-dynamical model, ChemTreeN, to characterize and quantify its relationship between model input parameters and predicted observable properties. The result of this analysis provides the user with information about which parameters are most important and most likely to affect the prediction of a given observable. It can also be used to simplify models by identifying input parameters that have no effect on the outputs of interest. Conversely, it allows us to identify what model parameters can be most efficiently constrained by the given observational data set. We have applied this technique to real observational data sets associated with the Milky Way, such as the luminosity function of satellite galaxies. A statistical comparison of model outputs and real observables is used to obtain a "best-fitting" parameter set. We consider different Milky Way-like dark matter halos to account for the dependence of the best-fitting parameters selection process on the underlying merger history of the models. For all formation histories considered, running ChemTreeN with best-fitting parameters produced luminosity functions that tightly fit their observed counterpart. However, only one of the resulting stellar halo models was able to reproduce the observed stellar halo mass within 40 kpc of the Galactic center. On the basis of this analysis it is possible to disregard certain models, and their corresponding merger histories, as good representations of the underlying merger history of the Milky Way.

Submitted 24 April, 2014; v1 submitted 11 November, 2013; originally announced November 2013.

Comments: Accepted for publication in ApJ; 18 pages, 14 figures

arXiv:1308.5957 [pdf, ps, other]

A template for describing intrinsic GRB pulse shapes

Authors: Jon Hakkila, Thomas J. Loredo, Robert L. Wolpert, Mary E. Broadbent, Robert D. Preece

Abstract: A preliminary study of a set of well-isolated pulses in GRB light curves indicates that simple pulse models, with smooth and monotonic pulse rise and decay regions, are inadequate. Examining the residuals of fits of pulses to such models suggests the following patterns of departure from the smooth pulse model of Norris et al. (2005): A Precursor Shelf occurs prior to or concurrent with the exponential Rapid Rise. The pulse reaches maximum intensity at the Peak Plateau, then undergoes a Rapid Decay. The decay changes into an Extended Tail. Pulses are almost universally characterized by hard-to-soft evolution, arguing that the new pulse features reflect a single physical phenomenon, rather than artifacts of pulse overlap.

Submitted 27 August, 2013; originally announced August 2013.

Comments: 6 pages, 3 figures. 7th Huntsville Gamma-Ray Burst Symposium, GRB 2013: paper 24 in eConf Proceedings C1304143

Journal ref: J. Hakkila, et al., in Proceedings of the 7th Huntsville Gamma-Ray Burst Symposium, Nashville, Tennessee, USA, 2013, edited by N. Gehrels, M. S. Briggs and V. Connaughton, eConf C1304143, 24, 2013 [arXiv:1308.5957]

arXiv:1209.2142 [pdf, other]

doi 10.1088/0004-637X/760/2/112

Characterizing the formation history of Milky Way-like stellar haloes with model emulators

Authors: Facundo A. Gómez, Christopher E. Coleman-Smith, Brian W. O'Shea, Jason Tumlinson, Robert L. Wolpert

Abstract: We use the semi-analytic model ChemTreeN, coupled to cosmological N-body simulations, to explore how different galaxy formation histories can affect observational properties of Milky Way-like galaxies' stellar haloes and their satellite populations. Gaussian processes are used to generate model emulators that allow one to statistically estimate a desired set of model outputs at any location of a p-dimensional input parameter space. This enables one to explore the full input parameter space orders of magnitude faster than could be done otherwise. Using mock observational data sets generated by ChemTreeN itself, we show that it is possible to successfully recover the input parameter vectors used to generate the mock observables if the merger history of the host halo is known. However, our results indicate that for a given observational data set the determination of "best fit" parameters is highly susceptible to the particular merger history of the host. Very different halo merger histories can reproduce the same observational data set, if the "best fit" parameters are allowed to vary from history to history. Thus, attempts to characterize the formation history of the Milky Way using these kinds of techniques must be performed statistically, analyzing large samples of high resolution N-body simulations.

Submitted 10 September, 2012; originally announced September 2012.

Comments: Accepted for publication in ApJ; 18 pages, 12 figures

Showing 1–6 of 6 results for author: Wolpert, R L