-
Real-time gravitational-wave inference for binary neutron stars using machine learning
Authors:
Maximilian Dax,
Stephen R. Green,
Jonathan Gair,
Nihar Gupte,
Michael Pürrer,
Vivien Raymond,
Jonas Wildberger,
Jakob H. Macke,
Alessandra Buonanno,
Bernhard Schölkopf
Abstract:
Mergers of binary neutron stars (BNSs) emit signals in both the gravitational-wave (GW) and electromagnetic (EM) spectra. Famously, the 2017 multi-messenger observation of GW170817 led to scientific discoveries across cosmology, nuclear physics, and gravity. Central to these results were the sky localization and distance obtained from GW data, which, in the case of GW170817, helped to identify the associated EM transient, AT 2017gfo, 11 hours after the GW signal. Fast analysis of GW data is critical for directing time-sensitive EM observations; however, due to challenges arising from the length and complexity of signals, it is often necessary to make approximations that sacrifice accuracy. Here, we develop a machine learning approach that performs complete BNS inference in just one second without making any such approximations. This is enabled by a new method for explicit integration of physical domain knowledge into neural networks. Our approach enhances multi-messenger observations by providing (i) accurate localization even before the merger; (ii) improved localization precision by $\sim30\%$ compared to approximate low-latency methods; and (iii) detailed information on luminosity distance, inclination, and masses, which can be used to prioritize expensive telescope time. Additionally, the flexibility and reduced cost of our method open new opportunities for equation-of-state and waveform systematics studies. Finally, we demonstrate that our method scales to extremely long signals, up to an hour in length, thus serving as a blueprint for data analysis for next-generation ground- and space-based detectors.
Submitted 12 July, 2024;
originally announced July 2024.
-
Use the 4S (Signal-Safe Speckle Subtraction): Explainable Machine Learning reveals the Giant Exoplanet AF Lep b in High-Contrast Imaging Data from 2011
Authors:
Markus J. Bonse,
Timothy D. Gebhard,
Felix A. Dannert,
Olivier Absil,
Faustine Cantalloube,
Valentin Christiaens,
Gabriele Cugno,
Emily O. Garvin,
Jean Hayoz,
Markus Kasper,
Elisabeth Matthews,
Bernhard Schölkopf,
Sascha P. Quanz
Abstract:
The main challenge of exoplanet high-contrast imaging (HCI) is to separate the signal of exoplanets from their host stars, which are many orders of magnitude brighter. HCI for ground-based observations is further exacerbated by speckle noise originating from perturbations in the Earth's atmosphere and imperfections in the telescope optics. Various data post-processing techniques are used to remove this speckle noise and reveal the faint planet signal. Often, however, a significant part of the planet signal is accidentally subtracted together with the noise. In the present work, we use explainable machine learning to investigate the reason for the loss of the planet signal for one of the most used post-processing methods: Principal Component Analysis (PCA). We find that PCA learns the shape of the telescope point spread function for high numbers of PCA components. This representation of the noise captures not only the speckle noise, but also the characteristic shape of the planet signal. Building upon these insights, we develop a new post-processing method (4S) that constrains the noise model to minimize this signal loss. We apply our model to 11 archival HCI datasets from the VLT-NACO instrument in the L'-band and find that our model consistently outperforms PCA. The improvement is largest at close separations to the star ($\leq 4 λ/D$) providing up to 1.5 magnitudes deeper contrast. This enhancement enables us to detect the exoplanet AF Lep b in data from 2011, 11 years before its subsequent discovery. We present updated orbital parameters for this object.
Submitted 3 June, 2024;
originally announced June 2024.
-
Evidence for eccentricity in the population of binary black holes observed by LIGO-Virgo-KAGRA
Authors:
Nihar Gupte,
Antoni Ramos-Buades,
Alessandra Buonanno,
Jonathan Gair,
M. Coleman Miller,
Maximilian Dax,
Stephen R. Green,
Michael Pürrer,
Jonas Wildberger,
Jakob Macke,
Bernhard Schölkopf
Abstract:
Binary black holes (BBHs) in eccentric orbits produce distinct modulations in the emitted gravitational waves (GWs). The measurement of orbital eccentricity can provide robust evidence for dynamical binary formation channels. We analyze 57 GW events from the first, second and third observing runs of the LIGO-Virgo-KAGRA (LVK) Collaboration using a multipolar aligned-spin inspiral-merger-ringdown waveform model with two eccentric parameters: eccentricity and relativistic anomaly. This is made computationally feasible with the machine-learning code DINGO which accelerates inference by 2-3 orders of magnitude compared to traditional inference. First, we find eccentric aligned-spin versus quasi-circular aligned-spin $\log_{10}$ Bayes factors of 1.84 to 4.75 (depending on the glitch mitigation) for GW200129, 3.0 for GW190701 and 1.77 for GW200208_22. We measure $e_{\text{gw}, 10Hz}$ to be $0.27_{-0.12}^{+0.10}$ to $0.17_{-0.13}^{+0.14}$ for GW200129, $0.35_{-0.11}^{+0.32}$ for GW190701 and $0.35_{-0.21}^{+0.18}$ for GW200208_22. Second, we find $\log_{10}$ Bayes factors for the eccentric aligned-spin versus quasi-circular precessing-spin hypotheses between 1.43 and 4.92 for GW200129, 2.61 for GW190701 and 1.23 for GW200208_22. Third, our analysis does not show evidence for eccentricity in GW190521, which has an eccentric aligned-spin against quasi-circular aligned-spin $\log_{10}$ Bayes factor of 0.04. Fourth, we estimate that if we neglect the spin-precession and use an astrophysical prior, the probability of one out of the 57 events being eccentric is greater than 99.5% or $(100 - 8.4 \times 10^{-4})$% (depending on the glitch mitigation). Fifth, we study the impact on parameter estimation when neglecting either eccentricity or higher modes in eccentric models. These results underscore the importance of including eccentric parameters in the characterization of BBHs for GW detectors.
Submitted 22 April, 2024;
originally announced April 2024.
-
Inferring Atmospheric Properties of Exoplanets with Flow Matching and Neural Importance Sampling
Authors:
Timothy D. Gebhard,
Jonas Wildberger,
Maximilian Dax,
Daniel Angerhausen,
Sascha P. Quanz,
Bernhard Schölkopf
Abstract:
Atmospheric retrievals (AR) characterize exoplanets by estimating atmospheric parameters from observed light spectra, typically by framing the task as a Bayesian inference problem. However, traditional approaches such as nested sampling are computationally expensive, thus sparking an interest in solutions based on machine learning (ML). In this ongoing work, we first explore flow matching posterior estimation (FMPE) as a new ML-based method for AR and find that, in our case, it is more accurate than neural posterior estimation (NPE), but less accurate than nested sampling. We then combine both FMPE and NPE with importance sampling, in which case both methods outperform nested sampling in terms of accuracy and simulation efficiency. Going forward, our analysis suggests that simulation-based inference with likelihood-based importance sampling provides a framework for accurate and efficient AR that may become a valuable tool not only for the analysis of observational data from existing telescopes, but also for the development of new missions and instruments.
Submitted 13 December, 2023;
originally announced December 2023.
-
Parameterizing pressure-temperature profiles of exoplanet atmospheres with neural networks
Authors:
Timothy D. Gebhard,
Daniel Angerhausen,
Björn S. Konrad,
Eleonora Alei,
Sascha P. Quanz,
Bernhard Schölkopf
Abstract:
Atmospheric retrievals (AR) of exoplanets typically rely on a combination of a Bayesian inference technique and a forward simulator to estimate atmospheric properties from an observed spectrum. A key component in simulating spectra is the pressure-temperature (PT) profile, which describes the thermal structure of the atmosphere. Current AR pipelines commonly use ad hoc fitting functions here that limit the retrieved PT profiles to simple approximations, but still use a relatively large number of parameters. In this work, we introduce a conceptually new, data-driven parameterization scheme for physically consistent PT profiles that does not require explicit assumptions about the functional form of the PT profiles and uses fewer parameters than existing methods. Our approach consists of a latent variable model (based on a neural network) that learns a distribution over functions (PT profiles). Each profile is represented by a low-dimensional vector that can be used to condition a decoder network that maps $P$ to $T$. When training and evaluating our method on two publicly available datasets of self-consistent PT profiles, we find that our method achieves, on average, better fit quality than existing baseline methods, despite using fewer parameters. In an AR based on existing literature, our model (using two parameters) produces a tighter, more accurate posterior for the PT profile than the five-parameter polynomial baseline, while also speeding up the retrieval by more than a factor of three. By providing parametric access to physically consistent PT profiles, and by reducing the number of parameters required to describe a PT profile (thereby reducing computational cost or freeing resources for additional parameters of interest), our method can help improve AR and thus our understanding of exoplanet atmospheres and their habitability.
Submitted 6 September, 2023;
originally announced September 2023.
-
Towards fully covariant machine learning
Authors:
Soledad Villar,
David W. Hogg,
Weichi Yao,
George A. Kevrekidis,
Bernhard Schölkopf
Abstract:
Any representation of data involves arbitrary investigator choices. Because those choices are external to the data-generating process, each choice leads to an exact symmetry, corresponding to the group of transformations that takes one possible representation to another. These are the passive symmetries; they include coordinate freedom, gauge symmetry, and units covariance, all of which have led to important results in physics. In machine learning, the most visible passive symmetry is the relabeling or permutation symmetry of graphs. Our goal is to understand the implications for machine learning of the many passive symmetries in play. We discuss dos and don'ts for machine learning practice if passive symmetries are to be respected. We discuss links to causal modeling, and argue that the implementation of passive symmetries is particularly valuable when the goal of the learning problem is to generalize out of sample. This paper is conceptual: It translates among the languages of physics, mathematics, and machine-learning. We believe that consideration and implementation of passive symmetries might help machine learning in the same ways that it transformed physics in the twentieth century.
Submitted 28 June, 2023; v1 submitted 31 January, 2023;
originally announced January 2023.
-
Adapting to noise distribution shifts in flow-based gravitational-wave inference
Authors:
Jonas Wildberger,
Maximilian Dax,
Stephen R. Green,
Jonathan Gair,
Michael Pürrer,
Jakob H. Macke,
Alessandra Buonanno,
Bernhard Schölkopf
Abstract:
Deep learning techniques for gravitational-wave parameter estimation have emerged as a fast alternative to standard samplers – producing results of comparable accuracy. These approaches (e.g., DINGO) enable amortized inference by training a normalizing flow to represent the Bayesian posterior conditional on observed data. By conditioning also on the noise power spectral density (PSD) they can even account for changing detector characteristics. However, training such networks requires knowing in advance the distribution of PSDs expected to be observed, and therefore can only take place once all data to be analyzed have been gathered. Here, we develop a probabilistic model to forecast future PSDs, greatly increasing the temporal scope of DINGO networks. Using PSDs from the second LIGO-Virgo observing run (O2) – plus just a single PSD from the beginning of the third (O3) – we show that we can train a DINGO network to perform accurate inference throughout O3 (on 37 real events). We therefore expect this approach to be a key component to enable the use of deep learning techniques for low-latency analyses of gravitational waves.
Submitted 16 November, 2022;
originally announced November 2022.
-
Neural Importance Sampling for Rapid and Reliable Gravitational-Wave Inference
Authors:
Maximilian Dax,
Stephen R. Green,
Jonathan Gair,
Michael Pürrer,
Jonas Wildberger,
Jakob H. Macke,
Alessandra Buonanno,
Bernhard Schölkopf
Abstract:
We combine amortized neural posterior estimation with importance sampling for fast and accurate gravitational-wave inference. We first generate a rapid proposal for the Bayesian posterior using neural networks, and then attach importance weights based on the underlying likelihood and prior. This provides (1) a corrected posterior free from network inaccuracies, (2) a performance diagnostic (the sample efficiency) for assessing the proposal and identifying failure cases, and (3) an unbiased estimate of the Bayesian evidence. By establishing this independent verification and correction mechanism we address some of the most frequent criticisms against deep learning for scientific inference. We carry out a large study analyzing 42 binary black hole mergers observed by LIGO and Virgo with the SEOBNRv4PHM and IMRPhenomXPHM waveform models. This shows a median sample efficiency of $\approx 10\%$ (two orders-of-magnitude better than standard samplers) as well as a ten-fold reduction in the statistical uncertainty in the log evidence. Given these advantages, we expect a significant impact on gravitational-wave inference, and for this approach to serve as a paradigm for harnessing deep learning methods in scientific applications.
Submitted 30 May, 2023; v1 submitted 11 October, 2022;
originally announced October 2022.
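The reweighting step described in the abstract above (draw from a fast proposal, then attach importance weights based on likelihood and prior) can be illustrated with a toy problem. This is a minimal sketch, not the authors' implementation: the one-dimensional Gaussian target, the wider Gaussian standing in for the neural proposal, and the function names `log_normal_pdf` and `importance_sample` are all assumptions made for illustration.

```python
import math
import random


def log_normal_pdf(x, mean, std):
    """Log density of a normal distribution N(mean, std^2)."""
    return -0.5 * ((x - mean) / std) ** 2 - math.log(std * math.sqrt(2 * math.pi))


def importance_sample(log_target, log_proposal, samples):
    """Reweight proposal samples towards an unnormalized target.

    log_target   : log(likelihood * prior), known only up to the evidence
    log_proposal : log density of the distribution the samples came from
    Returns (normalized weights, sample efficiency, log evidence estimate).
    """
    log_w = [log_target(x) - log_proposal(x) for x in samples]
    shift = max(log_w)  # subtract the maximum for numerical stability
    w = [math.exp(lw - shift) for lw in log_w]
    n = len(w)
    sum_w = sum(w)
    sum_w2 = sum(wi * wi for wi in w)
    efficiency = sum_w ** 2 / (n * sum_w2)  # effective sample size / n
    log_evidence = shift + math.log(sum_w / n)  # log of the mean raw weight
    return [wi / sum_w for wi in w], efficiency, log_evidence


# Hypothetical toy setup: the true evidence is 2.0 by construction.
def toy_log_target(x):
    return math.log(2.0) + log_normal_pdf(x, 0.3, 1.0)


def toy_log_proposal(x):
    return log_normal_pdf(x, 0.0, 1.2)


rng = random.Random(0)
proposal_samples = [rng.gauss(0.0, 1.2) for _ in range(20000)]
weights, efficiency, log_evidence = importance_sample(
    toy_log_target, toy_log_proposal, proposal_samples
)
print("sample efficiency:", efficiency)
print("log evidence estimate:", log_evidence, "true value:", math.log(2.0))
```

A sample efficiency well below 1 signals that the proposal matches the target poorly, which is the sense in which the abstract uses it as a performance diagnostic.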
-
A Generative Model for Quasar Spectra
Authors:
Anna-Christina Eilers,
David W. Hogg,
Bernhard Schölkopf,
Daniel Foreman-Mackey,
Frederick B. Davies,
Jan-Torge Schindler
Abstract:
We build a multi-output generative model for quasar spectra and the properties of their black hole engines, based on a Gaussian process latent-variable model. This model treats every quasar as a vector of latent properties such that the spectrum and all physical properties of the quasar are associated with non-linear functions of those latent parameters; the Gaussian process kernel functions define priors on the function space. Our generative model is trained with a justifiable likelihood function that allows us to treat heteroscedastic noise and missing data correctly, which is crucial for all astrophysical applications. It can predict simultaneously unobserved spectral regions, as well as the physical properties of quasars in held-out test data. We apply the model to rest-frame ultraviolet and optical quasar spectra for which precise black hole masses (based on reverberation mapping measurements) are available. Unlike reverberation-mapping studies, which require multi-epoch data, our model predicts black hole masses from single-epoch spectra, even with limited spectral coverage. We demonstrate the capabilities of the model by predicting black hole masses and unobserved spectral regions. We find that we predict black hole masses at close to the best possible accuracy.
Submitted 6 September, 2022;
originally announced September 2022.
-
Half-sibling regression meets exoplanet imaging: PSF modeling and subtraction using a flexible, domain knowledge-driven, causal framework
Authors:
Timothy D. Gebhard,
Markus J. Bonse,
Sascha P. Quanz,
Bernhard Schölkopf
Abstract:
High-contrast imaging of exoplanets hinges on powerful post-processing methods to denoise the data and separate the signal of a companion from its host star, which is typically orders of magnitude brighter. Existing post-processing algorithms do not use all prior domain knowledge that is available about the problem. We propose a new method that builds on our understanding of the systematic noise and the causal structure of the data-generating process. Our algorithm is based on a modified version of half-sibling regression (HSR), a flexible denoising framework that combines ideas from the fields of machine learning and causality. We adapt the method to address the specific requirements of high-contrast exoplanet imaging data obtained in pupil tracking mode. The key idea is to estimate the systematic noise in a pixel by regressing the time series of this pixel onto a set of causally independent, signal-free predictor pixels. We use regularized linear models in this work; however, other (non-linear) models are also possible. In a second step, we demonstrate how the HSR framework allows us to incorporate observing conditions such as wind speed or air temperature as additional predictors. When we apply our method to four data sets from the VLT/NACO instrument, our algorithm provides a better false-positive fraction than PCA-based PSF subtraction, a popular baseline method in the field. Additionally, we find that the HSR-based method provides direct and accurate estimates for the contrast of the exoplanets without the need to insert artificial companions for calibration in the data sets. Finally, we present first evidence that using the observing conditions as additional predictors can improve the results. Our HSR-based method provides an alternative, flexible and promising approach to the challenge of modeling and subtracting the stellar PSF and systematic noise in exoplanet imaging data.
Submitted 7 April, 2022;
originally announced April 2022.
-
Group equivariant neural posterior estimation
Authors:
Maximilian Dax,
Stephen R. Green,
Jonathan Gair,
Michael Deistler,
Bernhard Schölkopf,
Jakob H. Macke
Abstract:
Simulation-based inference with conditional neural density estimators is a powerful approach to solving inverse problems in science. However, these methods typically treat the underlying forward model as a black box, with no way to exploit geometric properties such as equivariances. Equivariances are common in scientific models; however, integrating them directly into expressive inference networks (such as normalizing flows) is not straightforward. We here describe an alternative method to incorporate equivariances under joint transformations of parameters and data. Our method -- called group equivariant neural posterior estimation (GNPE) -- is based on self-consistently standardizing the "pose" of the data while estimating the posterior over parameters. It is architecture-independent, and applies both to exact and approximate equivariances. As a real-world application, we use GNPE for amortized inference of astrophysical binary black hole systems from gravitational-wave observations. We show that GNPE achieves state-of-the-art accuracy while reducing inference times by three orders of magnitude.
Submitted 30 May, 2023; v1 submitted 25 November, 2021;
originally announced November 2021.
-
The unpopular Package: a Data-driven Approach to De-trend TESS Full Frame Image Light Curves
Authors:
Soichiro Hattori,
Daniel Foreman-Mackey,
David W. Hogg,
Benjamin T. Montet,
Ruth Angus,
T. A. Pritchard,
Jason L. Curtis,
Bernhard Schölkopf
Abstract:
The majority of observed pixels on the Transiting Exoplanet Survey Satellite (TESS) are delivered in the form of full frame images (FFI). However, the FFIs contain systematic effects such as pointing jitter and scattered light from the Earth and Moon that must be removed before downstream analysis. We present unpopular, an open-source Python package to de-trend TESS FFI light curves based on the causal pixel model method. Under the assumption that shared flux variations across multiple distant pixels are likely to be systematics, unpopular removes these common (i.e., popular) trends by modeling the systematics in a given pixel's light curve as a linear combination of light curves from many other distant pixels. To prevent overfitting we employ ridge regression and a train-and-test framework where the data points being de-trended are separated from those used to obtain the model coefficients. We also allow for simultaneous fitting with a polynomial model to capture any long-term astrophysical trends. We validate our method by de-trending different sources (e.g., supernova, tidal disruption event, exoplanet-hosting star, fast rotating star) and comparing our light curves to those obtained by other pipelines when appropriate. We also show that unpopular is able to preserve sector-length astrophysical signals, allowing for the extraction of multi-sector light curves from the FFI data. The unpopular source code and tutorials are freely available online.
Submitted 4 April, 2022; v1 submitted 28 June, 2021;
originally announced June 2021.
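The de-trending idea in the abstract above (model a pixel's systematics as a linear combination of distant pixels, fit with ridge regression, and keep the points being de-trended out of the fit) can be sketched on synthetic data. This is an illustrative toy, not the `unpopular` package or its API: the helper names `solve`, `ridge_fit`, `detrend`, and `make_toy_data`, the synthetic light curves, and the contiguous-chunk split are all assumptions.

```python
import math
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def ridge_fit(X, y, alpha):
    """Ridge coefficients from the normal equations (X^T X + alpha I) w = X^T y."""
    p = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) + (alpha if i == j else 0.0)
          for j in range(p)] for i in range(p)]
    b = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve(A, b)


def detrend(X, y, alpha=1.0, n_folds=4):
    """Subtract predicted systematics; each chunk is predicted by a model
    whose coefficients were fitted on the other chunks only."""
    n = len(y)
    residual = [0.0] * n
    for k in range(n_folds):
        test = [t for t in range(n) if t * n_folds // n == k]
        train = [t for t in range(n) if t * n_folds // n != k]
        coef = ridge_fit([X[t] for t in train], [y[t] for t in train], alpha)
        for t in test:
            residual[t] = y[t] - sum(c * v for c, v in zip(coef, X[t]))
    return residual


def make_toy_data(n=200, seed=0):
    """Hypothetical data: a shared systematic trend seen by 5 predictor
    pixels and by the target pixel, which also carries a short dip."""
    rng = random.Random(seed)
    gains = [0.5, 1.0, 1.5, 2.0, 2.5]
    X, y = [], []
    for t in range(n):
        s = math.sin(2 * math.pi * t / 50) + 0.01 * t  # shared systematics
        X.append([g * s + rng.gauss(0, 0.01) for g in gains])
        dip = -0.2 if 100 <= t < 110 else 0.0  # signal only in the target
        y.append(1.5 * s + dip + rng.gauss(0, 0.01))
    return X, y


X, y = make_toy_data()
residual = detrend(X, y)
in_dip = residual[100:110]
print("mean residual inside dip:", sum(in_dip) / len(in_dip))
```

Because the dip is absent from the predictor pixels, the regression cannot reproduce it, so it survives in the residual while the shared trend is removed.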
-
Real-time gravitational-wave science with neural posterior estimation
Authors:
Maximilian Dax,
Stephen R. Green,
Jonathan Gair,
Jakob H. Macke,
Alessandra Buonanno,
Bernhard Schölkopf
Abstract:
We demonstrate unprecedented accuracy for rapid gravitational-wave parameter estimation with deep learning. Using neural networks as surrogates for Bayesian posterior distributions, we analyze eight gravitational-wave events from the first LIGO-Virgo Gravitational-Wave Transient Catalog and find very close quantitative agreement with standard inference codes, but with inference times reduced from O(day) to a minute per event. Our networks are trained using simulated data, including an estimate of the detector-noise characteristics near the event. This encodes the signal and noise models within millions of neural-network parameters, and enables inference for any observed data consistent with the training distribution, accounting for noise nonstationarity from event to event. Our algorithm -- called "DINGO" -- sets a new standard in fast-and-accurate inference of physical parameters of detected gravitational-wave events, which should enable real-time data analysis without sacrificing accuracy.
Submitted 30 May, 2023; v1 submitted 23 June, 2021;
originally announced June 2021.
-
Physically constrained causal noise models for high-contrast imaging of exoplanets
Authors:
Timothy D. Gebhard,
Markus J. Bonse,
Sascha P. Quanz,
Bernhard Schölkopf
Abstract:
The detection of exoplanets in high-contrast imaging (HCI) data hinges on post-processing methods to remove spurious light from the host star. So far, existing methods for this task hardly utilize any of the available domain knowledge about the problem explicitly. We propose a new approach to HCI post-processing based on a modified half-sibling regression scheme, and show how we use this framework to combine machine learning with existing scientific domain knowledge. On three real data sets, we demonstrate that the resulting system performs clearly better (both visually and in terms of the SNR) than one of the currently leading algorithms. If further studies can confirm these results, our method could have the potential to allow significant discoveries of exoplanets both in new and archival data.
Submitted 9 December, 2020; v1 submitted 12 October, 2020;
originally announced October 2020.
-
Convolutional neural networks: a magic bullet for gravitational-wave detection?
Authors:
Timothy D. Gebhard,
Niki Kilbertus,
Ian Harry,
Bernhard Schölkopf
Abstract:
In the last few years, machine learning techniques, in particular convolutional neural networks, have been investigated as a method to replace or complement traditional matched filtering techniques that are used to detect the gravitational-wave signature of merging black holes. However, to date, these methods have not yet been successfully applied to the analysis of long stretches of data recorded by the Advanced LIGO and Virgo gravitational-wave observatories. In this work, we critically examine the use of convolutional neural networks as a tool to search for merging black holes. We identify the strengths and limitations of this approach, highlight some common pitfalls in translating between machine learning and gravitational-wave astronomy, and discuss the interdisciplinary challenges. In particular, we explain in detail why convolutional neural networks alone cannot be used to claim a statistically significant gravitational-wave detection. However, we demonstrate how they can still be used to rapidly flag the times of potential signals in the data for a more detailed follow-up. Our convolutional neural network architecture as well as the proposed performance metrics are better suited for this task than a standard binary classification scheme. A detailed evaluation of our approach on Advanced LIGO data demonstrates the potential of such systems as trigger generators. Finally, we sound a note of caution by constructing adversarial examples, which showcase interesting "failure modes" of our model, where inputs with no visible resemblance to real gravitational-wave signals are identified as such by the network with high confidence.
Submitted 6 September, 2019; v1 submitted 18 April, 2019;
originally announced April 2019.
-
A pixel-level model for event discovery in time-domain imaging
Authors:
Dun Wang,
David W. Hogg,
Daniel Foreman-Mackey,
Bernhard Schölkopf
Abstract:
Difference imaging or image subtraction is a method that measures differential photometry by matching the pointing and point-spread function (PSF) between image frames. It is used for the detection of time-variable phenomena. Here we present a new category of method---CPM Difference Imaging, in which differences are not measured between matched images but instead between image frames and a data-driven predictive model that has been designed only to predict the pointing, PSF, and detector effects but not astronomical variability. In CPM Difference Imaging each pixel is modelled by the Causal Pixel Model (CPM) originally built for modeling Kepler data, in which pixel values are predicted by a linear combination of other pixels at the same epoch but far enough away such that these pixels are causally disconnected, astrophysically. It does not require that the user have any explicit model or description of the pointing or point-spread function of any of the images. Its principal drawback is that---in its current form---it requires an imaging campaign with many epochs and fairly stable telescope pointing. The method is applied to simulated data and also the K2 Campaign 9 microlensing data. We show that CPM Difference Imaging can detect variable objects and produce precise differential photometry in a crowded field. CPM Difference Imaging is capable of producing image differences at nearly photon-noise precision.
Submitted 9 October, 2017; v1 submitted 6 October, 2017;
originally announced October 2017.
-
The population of long-period transiting exoplanets
Authors:
Daniel Foreman-Mackey,
Timothy D. Morton,
David W. Hogg,
Eric Agol,
Bernhard Schölkopf
Abstract:
The Kepler Mission has discovered thousands of exoplanets and revolutionized our understanding of their population. This large, homogeneous catalog of discoveries has enabled rigorous studies of the occurrence rate of exoplanets and planetary systems as a function of their physical properties. However, transit surveys like Kepler are most sensitive to planets with orbital periods much shorter than the orbital periods of Jupiter and Saturn, the most massive planets in our Solar System. To address this deficiency, we perform a fully automated search for long-period exoplanets with only one or two transits in the archival Kepler light curves. When applied to the $\sim 40,000$ brightest Sun-like target stars, this search produces 16 long-period exoplanet candidates. Of these candidates, 6 are novel discoveries and 5 are in systems with inner short-period transiting planets. Since our method involves no human intervention, we empirically characterize the detection efficiency of our search. Based on these results, we measure the average occurrence rate of exoplanets smaller than Jupiter with orbital periods in the range 2-25 years to be $2.0\pm0.7$ planets per Sun-like star.
Submitted 6 October, 2016; v1 submitted 27 July, 2016;
originally announced July 2016.
-
A Causal, Data-Driven Approach to Modeling the Kepler Data
Authors:
Dun Wang,
David W. Hogg,
Dan Foreman-Mackey,
Bernhard Schölkopf
Abstract:
Astronomical observations are affected by several kinds of noise, each with its own causal source; there is photon noise, stochastic source variability, and residuals coming from imperfect calibration of the detector or telescope. The precision of NASA Kepler photometry for exoplanet science---the most precise photometric measurements of stars ever made---appears to be limited by unknown or untracked variations in spacecraft pointing and temperature, and unmodeled stellar variability. Here we present the Causal Pixel Model (CPM) for Kepler data, a data-driven model intended to capture variability but preserve transit signals. The CPM works at the pixel level so that it can capture very fine-grained information about the variation of the spacecraft. The CPM predicts each target pixel value from a large number of pixels of other stars sharing the instrument variabilities while not containing any information on possible transits in the target star. In addition, we use the target star's future and past (auto-regression). By appropriately separating, for each data point, the data into training and test sets, we ensure that information about any transit will be perfectly isolated from the model. The method has four hyper-parameters (the number of predictor stars, the auto-regressive window size, and two L2-regularization amplitudes for model components), which we set by cross-validation. We determine a generic set of hyper-parameters that works well for most of the stars and apply the method to a corresponding set of target stars. We find that we can consistently outperform (for the purposes of exoplanet detection) the Kepler Pre-search Data Conditioning (PDC) method for exoplanet discovery.
Submitted 25 April, 2016; v1 submitted 8 August, 2015;
originally announced August 2015.
-
Removing systematic errors for exoplanet search via latent causes
Authors:
Bernhard Schölkopf,
David W. Hogg,
Dun Wang,
Daniel Foreman-Mackey,
Dominik Janzing,
Carl-Johann Simon-Gabriel,
Jonas Peters
Abstract:
We describe a method for removing the effect of confounders in order to reconstruct a latent quantity of interest. The method, referred to as half-sibling regression, is inspired by recent work in causal inference using additive noise models. We provide a theoretical justification and illustrate the potential of the method in a challenging astronomy application.
Submitted 12 May, 2015;
originally announced May 2015.
-
A systematic search for transiting planets in the K2 data
Authors:
Daniel Foreman-Mackey,
Benjamin T. Montet,
David W. Hogg,
Timothy D. Morton,
Dun Wang,
Bernhard Schölkopf
Abstract:
Photometry of stars from the K2 extension of NASA's Kepler mission is afflicted by systematic effects caused by small (few-pixel) drifts in the telescope pointing and other spacecraft issues. We present a method for searching K2 light curves for evidence of exoplanets by simultaneously fitting for these systematics and the transit signals of interest. This method is more computationally expensive than standard search algorithms but we demonstrate that it can be efficiently implemented and used to discover transit signals. We apply this method to the full Campaign 1 dataset and report a list of 36 planet candidates transiting 31 stars, along with an analysis of the pipeline performance and detection efficiency based on artificial signal injections and recoveries. For all planet candidates, we present posterior distributions on the properties of each system based strictly on the transit observables.
Submitted 2 June, 2015; v1 submitted 16 February, 2015;
originally announced February 2015.
-
Towards building a Crowd-Sourced Sky Map
Authors:
Dustin Lang,
David W. Hogg,
Bernhard Scholkopf
Abstract:
We describe a system that builds a high dynamic-range and wide-angle image of the night sky by combining a large set of input images. The method makes use of pixel-rank information in the individual input images to improve a "consensus" pixel rank in the combined image. Because it only makes use of ranks and the complexity of the algorithm is linear in the number of images, the method is useful for large sets of uncalibrated images that might have undergone unknown non-linear tone mapping transformations for visualization or aesthetic reasons. We apply the method to images of the night sky (of unknown provenance) discovered on the Web. The method permits discovery of astronomical objects or features that are not visible in any of the input images taken individually. More importantly, however, it permits scientific exploitation of a huge source of astronomical images that would not be available to astronomical research without our automatic system.
Submitted 5 June, 2014;
originally announced June 2014.
-
Maximizing Kepler science return per telemetered pixel: Searching the habitable zones of the brightest stars
Authors:
Benjamin T. Montet,
Ruth Angus,
Tom Barclay,
Rebekah Dawson,
Rob Fergus,
Dan Foreman-Mackey,
Stefan Harmeling,
Michael Hirsch,
David W. Hogg,
Dustin Lang,
David Schiminovich,
Bernhard Scholkopf
Abstract:
In today's mailing, Hogg et al. propose image modeling techniques to maintain 10-ppm-level precision photometry in Kepler data with only two working reaction wheels. While these results are relevant to many scientific goals for the repurposed mission, all modeling efforts so far have used a toy model of the Kepler telescope. Because the two-wheel performance of Kepler remains to be determined, we advocate for the consideration of an alternate strategy for a >1 year program that maximizes the science return from the "low-torque" fields across the ecliptic plane. Assuming we can reach the precision of the original Kepler mission, we expect to detect 800 new planet candidates in the first year of such a mission. Our proposed strategy has benefits for transit timing variation and transit duration variation studies, especially when considered in concert with the future TESS mission. We also expect to help address the first key science goal of Kepler: the frequency of planets in the habitable zone as a function of spectral type.
Submitted 3 September, 2013;
originally announced September 2013.
-
Maximizing Kepler science return per telemetered pixel: Detailed models of the focal plane in the two-wheel era
Authors:
David W. Hogg,
Ruth Angus,
Tom Barclay,
Rebekah Dawson,
Rob Fergus,
Dan Foreman-Mackey,
Stefan Harmeling,
Michael Hirsch,
Dustin Lang,
Benjamin T. Montet,
David Schiminovich,
Bernhard Schölkopf
Abstract:
Kepler's immense photometric precision to date was maintained through satellite stability and precise pointing. In this white paper, we argue that image modeling--fitting the Kepler-downlinked raw pixel data--can vastly improve the precision of Kepler in pointing-degraded two-wheel mode. We argue that a non-trivial modeling effort may permit continuance of photometry at 10-ppm-level precision. We demonstrate some baby steps towards precise models in both data-driven (flexible) and physics-driven (interpretably parameterized) modes. We demonstrate that the expected drift or jitter in positions in the two-wheel era will help with constraining calibration parameters. In particular, we show that we can infer the device flat-field at higher than pixel resolution; that is, we can infer pixel-to-pixel variations in intra-pixel sensitivity. These results are relevant to almost any scientific goal for the repurposed mission; image modeling ought to be a part of any two-wheel repurpose for the satellite. We make other recommendations for Kepler operations, but fundamentally advocate that the project stick with its core mission of finding and characterizing Earth analogs. [abridged]
Submitted 3 September, 2013;
originally announced September 2013.
-
Gravitational Lensing Accuracy Testing 2010 (GREAT10) Challenge Handbook
Authors:
Thomas Kitching,
Sreekumar Balan,
Gary Bernstein,
Matthias Bethge,
Sarah Bridle,
Frederic Courbin,
Marc Gentile,
Alan Heavens,
Michael Hirsch,
Reshad Hosseini,
Alina Kiessling,
Adam Amara,
Donnacha Kirk,
Konrad Kuijken,
Rachel Mandelbaum,
Baback Moghaddam,
Guldariya Nurbaeva,
Stephane Paulin-Henriksson,
Anais Rassat,
Jason Rhodes,
Bernhard Schölkopf,
John Shawe-Taylor,
Mandeep Gill,
Marina Shmakova,
Andy Taylor
, et al. (10 additional authors not shown)
Abstract:
GRavitational lEnsing Accuracy Testing 2010 (GREAT10) is a public image analysis challenge aimed at the development of algorithms to analyze astronomical images. Specifically, the challenge is to measure varying image distortions in the presence of a variable convolution kernel, pixelization and noise. This is the second in a series of challenges set to the astronomy, computer science and statistics communities, providing a structured environment in which methods can be improved and tested in preparation for planned astronomical surveys. GREAT10 extends previous work by introducing variable fields into the challenge. The "Galaxy Challenge" involves the precise measurement of galaxy shape distortions, quantified locally by two parameters called shear, in the presence of a known convolution kernel. Crucially, the convolution kernel and the simulated gravitational lensing shape distortion both now vary as a function of position within the images, as is the case for real data. In addition, we introduce the "Star Challenge", which concerns the reconstruction of a variable convolution kernel, similar to that in a typical astronomical observation. This document details the GREAT10 Challenge for potential participants. Continually updated information is also available from http://www.greatchallenges.info.
Submitted 30 November, 2011; v1 submitted 3 September, 2010;
originally announced September 2010.
-
Results of the GREAT08 Challenge: An image analysis competition for cosmological lensing
Authors:
Sarah Bridle,
Sreekumar T. Balan,
Matthias Bethge,
Marc Gentile,
Stefan Harmeling,
Catherine Heymans,
Michael Hirsch,
Reshad Hosseini,
Mike Jarvis,
Donnacha Kirk,
Thomas Kitching,
Konrad Kuijken,
Antony Lewis,
Stephane Paulin-Henriksson,
Bernhard Scholkopf,
Malin Velander,
Lisa Voigt,
Dugan Witherick,
Adam Amara,
Gary Bernstein,
Frederic Courbin,
Mandeep Gill,
Alan Heavens,
Rachel Mandelbaum,
Richard Massey
, et al. (9 additional authors not shown)
Abstract:
We present the results of the GREAT08 Challenge, a blind analysis challenge to infer weak gravitational lensing shear distortions from images. The primary goal was to stimulate new ideas by presenting the problem to researchers outside the shear measurement community. Six GREAT08 Team methods were presented at the launch of the Challenge and five additional groups submitted results during the 6-month competition. Participants analyzed 30 million simulated galaxies with a range in signal-to-noise ratio, point-spread function ellipticity, galaxy size, and galaxy type. The large quantity of simulations allowed shear measurement methods to be assessed, for the first time, at a level of accuracy suitable for currently planned future cosmic shear observations. Different methods perform well in different parts of simulation parameter space and come close to the target level of accuracy in several of these. A number of fresh ideas have emerged as a result of the Challenge, including a re-examination of the process of combining information from different galaxies, which reduces the dependence on realistic galaxy modelling. The image simulations will become increasingly sophisticated in future GREAT challenges; meanwhile, the GREAT08 simulations remain as a benchmark for additional developments in shear measurement algorithms.
Submitted 7 August, 2009;
originally announced August 2009.