doi 10.3390/astronomy3010002

Beyond mirkwood: Enhancing SED Modeling with Conformal Predictions

Abstract: Traditional spectral energy distribution (SED) fitting techniques face uncertainties due to assumptions in star formation histories and dust attenuation curves. We propose an advanced machine learning-based approach that enhances flexibility and uncertainty quantification in SED fitting. Unlike the fixed NGBoost model used in mirkwood, our approach allows for any sklearn-compatible model, including deterministic models. We incorporate conformalized quantile regression to convert point predictions into error bars, enhancing interpretability and reliability. Using CatBoost as the base predictor, we compare results with and without conformal prediction, demonstrating improved performance using metrics such as coverage and interval width. Our method offers a more versatile and accurate tool for deriving galaxy physical properties from observational data.

Submitted 10 February, 2024; v1 submitted 21 December, 2023; originally announced December 2023.

Comments: 6 pages + 1 reference page. Accepted to the 3rd AI2ASE workshop at AAAI 2024 (Vancouver, BC, Canada)

Journal ref: Astronomy 2024, 3, 14-20
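
The calibration step of conformalized quantile regression mentioned in the abstract above can be sketched in a few lines. This is a minimal illustration and not the paper's code: it assumes some quantile model has already produced lower and upper predictions, and the function names (`cqr_offset`, `conformalize`, `coverage`, `mean_width`) are hypothetical. It follows the standard split-conformal recipe: score each calibration point by how far it falls outside its predicted interval, take a finite-sample-corrected quantile of those scores, and widen (or shrink) every test interval by that amount.

```python
import math


def cqr_offset(cal_lo, cal_hi, cal_y, alpha=0.1):
    """Return the interval correction computed on a held-out calibration set."""
    # Conformity score: distance by which the true value misses its interval
    # (negative when the value lies strictly inside the interval).
    scores = sorted(
        max(lo - y, y - hi) for lo, hi, y in zip(cal_lo, cal_hi, cal_y)
    )
    n = len(scores)
    # Finite-sample-corrected rank: ceil((n + 1) * (1 - alpha)).
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf  # too few calibration points for this alpha
    return scores[k - 1]


def conformalize(test_lo, test_hi, q):
    """Apply the correction q to each predicted interval."""
    return [(lo - q, hi + q) for lo, hi in zip(test_lo, test_hi)]


def coverage(intervals, y):
    """Fraction of true values that fall inside their intervals."""
    hits = sum(lo <= t <= hi for (lo, hi), t in zip(intervals, y))
    return hits / len(y)


def mean_width(intervals):
    """Average interval width."""
    return sum(hi - lo for lo, hi in intervals) / len(intervals)
```

Coverage and mean interval width are the two metrics the abstract names for comparing results with and without conformal prediction; the inputs here are illustrative placeholders for whatever base predictor is used.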

arXiv:2311.03738 [pdf, other]

deep-REMAP: Parameterization of Stellar Spectra Using Regularized Multi-Task Learning

Authors: Sankalp Gilda

Abstract: Traditional spectral analysis methods are increasingly challenged by the exploding volumes of data produced by contemporary astronomical surveys. In response, we develop deep-Regularized Ensemble-based Multi-task Learning with Asymmetric Loss for Probabilistic Inference (deep-REMAP), a novel framework that utilizes the rich synthetic spectra from the PHOENIX library and observational data from the MARVELS survey to accurately predict stellar atmospheric parameters. By harnessing advanced machine learning techniques, including multi-task learning and an innovative asymmetric loss function, deep-REMAP demonstrates superior predictive capabilities in determining effective temperature, surface gravity, and metallicity from observed spectra. Our results reveal the framework's effectiveness in extending to other stellar libraries and properties, paving the way for more sophisticated and automated techniques in stellar characterization.

Submitted 21 November, 2023; v1 submitted 7 November, 2023; originally announced November 2023.

Comments: 5 main pages + 2 figures. Accepted to the ML4PS workshop at NeurIPS 2023

arXiv:2112.14072 [pdf, other]

doi 10.3390/astronomy3030012

Unsupervised Domain Adaptation for Constraining Star Formation Histories

Authors: Sankalp Gilda, Antoine de Mathelin, Sabine Bellstedt, Guillaume Richard

Abstract: The prevalent paradigm of machine learning today is to use past observations to predict future ones. What if, however, we are interested in knowing the past given the present? This situation is indeed one that astronomers must contend with often. To understand the formation of our universe, we must derive the time evolution of the visible mass content of galaxies. However, to observe a complete star life, one would need to wait for one billion years! To overcome this difficulty, astrophysicists leverage supercomputers and evolve simulated models of galaxies till the current age of the universe, thus establishing a mapping between observed radiation and star formation histories (SFHs). Such ground-truth SFHs are lacking for actual galaxy observations, where they are usually inferred -- with often poor confidence -- from spectral energy distributions (SEDs) using Bayesian fitting methods. In this investigation, we discuss the ability of unsupervised domain adaptation to derive accurate SFHs for galaxies with simulated data as a necessary first step in developing a technique that can ultimately be applied to observational data.

Submitted 26 August, 2022; v1 submitted 28 December, 2021; originally announced December 2021.

Comments: Accepted for oral presentation at the 1st Annual AAAI Workshop on AI to Accelerate Science and Engineering (AI2ASE). Journal article to follow

Journal ref: Astronomy 2024, 3(3), 189-207

arXiv:2107.00048 [pdf, other]

doi 10.1093/mnras/stab3243

Uncertainty-Aware Learning for Improvements in Image Quality of the Canada-France-Hawaii Telescope

Authors: Sankalp Gilda, Stark C. Draper, Sebastien Fabbro, William Mahoney, Simon Prunet, Kanoa Withington, Matthew Wilson, Yuan-Sen Ting, Andrew Sheinis

Abstract: We leverage state-of-the-art machine learning methods and a decade's worth of archival data from CFHT to predict observatory image quality (IQ) from environmental conditions and observatory operating parameters. Specifically, we develop accurate and interpretable models of the complex dependence between data features and observed IQ for CFHT's wide-field camera, MegaCam. Our contributions are several-fold. First, we collect, collate and reprocess several disparate data sets gathered by CFHT scientists. Second, we predict probability distribution functions (PDFs) of IQ and achieve a mean absolute error of $\sim0.07''$ for the predicted medians. Third, we explore the data-driven actuation of the 12 dome "vents" installed in 2013-14 to accelerate the flushing of hot air from the dome. We leverage epistemic and aleatoric uncertainties in conjunction with probabilistic generative modeling to identify candidate vent adjustments that are in-distribution (ID); for the optimal configuration for each ID sample, we predict the reduction in required observing time to achieve a fixed SNR. On average, the reduction is $\sim12\%$. Finally, we rank input features by their Shapley values to identify the most predictive variables for each observation. Our long-term goal is to construct reliable and real-time models that can forecast optimal observatory operating parameters to optimize IQ. We can then feed such forecasts into scheduling protocols and predictive maintenance routines. We anticipate that such approaches will become standard in automating observatory operations and maintenance by the time CFHT's successor, the Maunakea Spectroscopic Explorer, is installed in the next decade.

Submitted 15 November, 2021; v1 submitted 30 June, 2021; originally announced July 2021.

Comments: 25 pages and 12 figures (main) + 7 pages and 8 figures (2 appendices). Accepted for publication in MNRAS

arXiv:2101.04687 [pdf, other]

doi 10.3847/1538-4357/ac0058

mirkwood: Fast and Accurate SED Modeling Using Machine Learning

Authors: Sankalp Gilda, Sidney Lower, Desika Narayanan

Abstract: Traditional spectral energy distribution (SED) fitting codes used to derive galaxy physical properties are often uncertain at the factor-of-a-few level owing to uncertainties in galaxy star formation histories and dust attenuation curves. Beyond this, Bayesian fitting (which is typically used in SED fitting software) is an intrinsically compute-intensive task, often requiring access to expensive hardware for long periods of time. To overcome these shortcomings, we have developed mirkwood: a user-friendly tool comprising an ensemble of supervised machine learning-based models capable of non-linearly mapping galaxy fluxes to their properties. By stacking multiple models, we marginalize against any individual model's poor performance in a given region of the parameter space. We demonstrate mirkwood's significantly improved performance over traditional techniques by training it on a combined data set of mock photometry of z=0 galaxies from the Simba, EAGLE and IllustrisTNG cosmological simulations, and comparing the derived results with those obtained from traditional SED fitting techniques. mirkwood is also able to account for uncertainties arising both from intrinsic noise in observations, and from finite training data and incorrect modeling assumptions. To increase the added value to the observational community, we use Shapley value explanations (SHAP) to fairly evaluate the relative importance of different bands to understand why particular predictions were reached. We envisage mirkwood to be an evolving, open-source framework that will provide highly accurate physical properties from observations of galaxies as compared to traditional SED fitting.

Submitted 12 January, 2021; originally announced January 2021.

Comments: 26 pages + 4 pages for appendix. Submitted to ApJ. Comments welcome

arXiv:2011.03132 [pdf, other]

Astronomical Image Quality Prediction based on Environmental and Telescope Operating Conditions

Authors: Sankalp Gilda, Yuan-Sen Ting, Kanoa Withington, Matthew Wilson, Simon Prunet, William Mahoney, Sebastien Fabbro, Stark C. Draper, Andrew Sheinis

Abstract: Intelligent scheduling of the sequence of scientific exposures taken at ground-based astronomical observatories is massively challenging. Observing time is over-subscribed and atmospheric conditions are constantly changing. We propose to guide observatory scheduling using machine learning. Leveraging a 15-year archive of exposures, environmental, and operating conditions logged by the Canada-France-Hawaii Telescope, we construct a probabilistic data-driven model that accurately predicts image quality. We demonstrate that, by optimizing the opening and closing of twelve vents placed on the dome of the telescope, we can reduce dome-induced turbulence and improve telescope image quality by (0.05-0.2 arc-seconds). This translates to a reduction in exposure time (and hence cost) of $\sim 10-15\%$. Our study is the first step toward data-based optimization of the multi-million dollar operations of current and next-generation telescopes.

Submitted 5 November, 2020; originally announced November 2020.

Comments: 4 pages, 3 figures. Accepted to Machine Learning and the Physical Sciences Workshop at the 34th Conference on Neural Information Processing Systems (NeurIPS)

arXiv:1907.05074 [pdf, other]

Gamma-ray Bursts as distance indicators through a machine learning approach

Authors: Maria Dainotti, Vahé Petrosian, Malgorzata Bogdan, Blazej Miasojedow, Shigehiro Nagataki, Trevor Hastie, Zooey Nuyngen, Sankalp Gilda, Xavier Hernandez, Dominika Krol

Abstract: Gamma-ray bursts (GRBs) are spectacularly energetic events, with the potential to inform on the early universe and its evolution, once their redshifts are known. Unfortunately, determining redshifts is a painstaking procedure requiring detailed follow-up multi-wavelength observations often involving various astronomical facilities, which have to be rapidly pointed at these serendipitous events. Here we use Machine Learning algorithms to infer redshifts from a collection of observed temporal and spectral features of GRBs. We obtained a very high correlation coefficient ($0.96$) between the inferred and the observed redshifts, and a small dispersion (with a mean square error of $0.003$) in the test set. The addition of plateau afterglow parameters improves the predictions by $61.4\%$ compared to previous results. The GRB luminosity function and cumulative density rate evolutions, obtained from predicted and observed redshifts, are in excellent agreement, indicating that GRBs are effective distance indicators and a reliable step for the cosmic distance ladder.

Submitted 11 July, 2019; originally announced July 2019.

Comments: 22 pages, 5 figures to be submitted

arXiv:1903.05075 [pdf, other]

doi 10.1093/mnras/stz2577

Automatic Kalman-Filter-based Wavelet Shrinkage Denoising of 1D Stellar Spectra

Authors: Sankalp Gilda, Zachary Slepian

Abstract: We propose a non-parametric method to denoise 1D stellar spectra based on wavelet shrinkage followed by adaptive Kalman thresholding. Wavelet shrinkage denoising involves applying the Discrete Wavelet Transform (DWT) to the input signal, 'shrinking' certain frequency components in the transform domain, and then applying inverse DWT to the reduced components. The performance of this procedure is influenced by the choice of base wavelet, the number of decomposition levels, and the thresholding function. Typically, these parameters are chosen by 'trial and error', which can be strongly dependent on the properties of the data being denoised. We here introduce an adaptive Kalman-filter-based thresholding method that eliminates the need for choosing the number of decomposition levels. We use the 'Haar' wavelet basis, which we found to be the best-suited for 1D stellar spectra. We introduce various levels of Poisson noise into synthetic PHOENIX spectra, and test the performance of several common denoising methods against our own. It proves superior in terms of noise suppression and peak shape preservation. We expect it may also be of use in automatically and accurately filtering low signal-to-noise galaxy and quasar spectra obtained from surveys such as SDSS, Gaia, LSST, PESSTO, VANDELS, LEGA-C, and DESI.

Submitted 2 July, 2020; v1 submitted 12 March, 2019; originally announced March 2019.

Comments: 21 pages, 2 appendices, 17 figures. Published in MNRAS

Journal ref: Monthly Notices of the Royal Astronomical Society 490.4 (2019): 5249-5269
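
The generic wavelet shrinkage procedure described in the abstract above (forward DWT, shrink the detail coefficients, inverse DWT) can be sketched with the Haar basis in plain Python. This is an illustrative sketch only: it uses ordinary soft thresholding with a fixed, caller-supplied threshold and number of levels, which are exactly the hand-tuned parameters the paper's adaptive Kalman-filter method is meant to replace. The Kalman thresholding itself is not reproduced here, and all function names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_step(signal):
    """One level of the orthonormal Haar DWT (signal length must be even)."""
    approx = [(signal[i] + signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse_step(approx, detail):
    """Invert one Haar level, interleaving the reconstructed sample pairs."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / SQRT2)
        out.append((a - d) / SQRT2)
    return out


def soft_threshold(coeffs, t):
    """Shrink each coefficient toward zero by t; zero those smaller than t."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]


def wavelet_shrink(signal, levels, threshold):
    """Decompose, soft-threshold the detail coefficients, and reconstruct."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    approx, details = list(signal), []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(soft_threshold(detail, threshold))
    # Rebuild from the coarsest level back to the original resolution.
    for detail in reversed(details):
        approx = haar_inverse_step(approx, detail)
    return approx
```

With a threshold of zero the transform round-trips the input; with a very large threshold every detail coefficient is removed and the output collapses to block averages, which shows why the choice of threshold and depth matters for preserving line shapes.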

arXiv:1902.07215 [pdf, ps, other]

Feature Selection for Better Spectral Characterization or: How I Learned to Start Worrying and Love Ensembles

Authors: Sankalp Gilda

Abstract: An ever-looming threat to astronomical applications of machine learning is the danger of over-fitting data, also known as the 'curse of dimensionality.' This occurs when there are fewer samples than the number of independent variables. In this work, we focus on the problem of stellar parameterization from low-mid resolution spectra, with blended absorption lines. We address this problem using an iterative algorithm to sequentially prune redundant features from synthetic PHOENIX spectra, and arrive at an optimal set of wavelengths with the strongest correlation with each of the output variables -- $T_{\rm eff}$, $\log g$, and [Fe/H]. We find that at any given resolution, most features (i.e., absorption lines) are not only redundant, but actually act as noise and decrease the accuracy of parameter retrieval.

Submitted 22 February, 2019; v1 submitted 19 February, 2019; originally announced February 2019.

Comments: 4 pages, 1 figure, presented at Astronomical Data Analysis Software & Systems (ADASS) 2018

Report number: eISBN: 978-1-58381-934-0

Journal ref: Astronomical Data Analysis Software and Systems XXVIII ASP Conference Series 523 (2018) 67

arXiv:1807.07098 [pdf, ps, other]

doi 10.1093/mnras/sty1933

The first super-Earth Detection from the High Cadence and High Radial Velocity Precision Dharma Planet Survey

Authors: Bo Ma, Jian Ge, Matthew Muterspaugh, Michael A. Singer, Gregory W. Henry, Jonay I. Gonzalez Hernandez, Sirinrat Sithajan, Sarik Jeram, Michael Williamson, Keivan Stassun, Benjamin Kimock, Frank Varosi, Sidney Schofield, Jian Liu, Scott Powell, Anthony Cassette, Hali Jakeman, Louis Avner, Nolan Grieves, Rory Barnes, Sankalp Gilda, Jim Grantham, Greg Stafford, David Savage, Steve Bland, et al. (1 additional author not shown)

Abstract: The Dharma Planet Survey (DPS) aims to monitor about 150 nearby very bright FGKM dwarfs (within 50 pc) during 2016-2020 for low-mass planet detection and characterization using the TOU very high resolution optical spectrograph ($R\approx100{,}000$, 380-900 nm). TOU was initially mounted to the 2-m Automatic Spectroscopic Telescope at Fairborn Observatory in 2013-2015 to conduct a pilot survey, then moved to the dedicated 50-inch automatic telescope on Mt. Lemmon in 2016 to launch the survey. Here we report the first planet detection from DPS, a super-Earth candidate orbiting a bright K dwarf star, HD 26965. It is the second brightest star ($V=4.4$ mag) on the sky with a super-Earth candidate. The planet candidate has a mass of $8.47\pm0.47\,M_{\rm Earth}$, period of $42.38\pm0.01$ d, and eccentricity of $0.04^{+0.05}_{-0.03}$. This RV signal was independently detected by Diaz et al. (2018), but they could not confirm if the signal is from a planet or from stellar activity. The orbital period of the planet is close to the rotation period of the star (39-44.5 d) measured from stellar activity indicators. Our high precision photometric campaign and line bisector analysis of this star do not find any significant variations at the orbital period. Stellar RV jitters modeled from star spots and convection inhibition are also not strong enough to explain the RV signal detected. After further comparing RV data from the star's active magnetic phase and quiet magnetic phase, we conclude that the RV signal is due to planetary-reflex motion and not stellar activity.

Submitted 18 July, 2018; originally announced July 2018.

Comments: 13 pages, 17 figures, Accepted for publication in MNRAS

Showing 1–10 of 10 results for author: Gilda, S