-
Selection of powerful radio galaxies with machine learning
Authors:
R. Carvajal,
I. Matute,
J. Afonso,
R. P. Norris,
K. J. Luken,
P. Sánchez-Sáez,
P. A. C. Cunha,
A. Humphrey,
H. Messias,
S. Amarantidis,
D. Barbosa,
H. A. Cruz,
H. Miranda,
A. Paulino-Afonso,
C. Pappalardo
Abstract:
We developed and trained a pipeline of three machine learning (ML) models that can predict which sources are more likely to be AGNs and to be detected in specific radio surveys. It can also estimate redshift values for predicted radio-detectable AGNs. These models, which combine predictions from tree-based and gradient-boosting algorithms, have been trained with multi-wavelength data from near-infrared-selected sources in the Hobby-Eberly Telescope Dark Energy Experiment (HETDEX) Spring field. Training, testing, calibration, and validation were carried out in the HETDEX field. Further validation was performed on near-infrared-selected sources in the Stripe 82 field. In the HETDEX validation subset, our pipeline recovers 96% of the initially labelled AGNs and, among the AGN candidates, it recovers 50% of previously detected radio sources. For Stripe 82, these numbers are 94% and 55%. Compared to random selection, these rates are two and four times better for HETDEX, and 1.2 and 12 times better for Stripe 82. The pipeline can also recover the redshift distribution of these sources with $\sigma_{\mathrm{NMAD}}$ = 0.07 for HETDEX ($\sigma_{\mathrm{NMAD}}$ = 0.09 for Stripe 82) and an outlier fraction of 19% (25% for Stripe 82), compatible with previous results based on broad-band photometry. Feature importance analysis stresses the relevance of near- and mid-infrared colours to select AGNs and identify their radio and redshift nature. Combining different algorithms in ML models shows an improvement in the prediction power of our pipeline over a random selection of sources. Tree-based ML models (in contrast to deep learning techniques) facilitate the analysis of the impact that features have on the predictions. This prediction can give insight into the potential physical interplay between the properties of radio AGNs (e.g. mass of black hole and accretion rate).
Submitted 1 December, 2023; v1 submitted 20 September, 2023;
originally announced September 2023.
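The abstract above quotes redshift quality as $\sigma_{\mathrm{NMAD}}$ and an outlier fraction. A minimal sketch of how such metrics are commonly computed follows. The exact definitions are assumptions, not taken from the abstract: here $\sigma_{\mathrm{NMAD}} = 1.48 \times \mathrm{median}(|\Delta z| / (1 + z_{\mathrm{true}}))$ and an outlier is any source with $|\Delta z| / (1 + z_{\mathrm{true}}) > 0.15$. The function names and the sample values are hypothetical.

```python
from statistics import median


def scaled_residuals(z_true, z_pred):
    """Return |z_pred - z_true| / (1 + z_true) for each source."""
    return [abs(p - t) / (1.0 + t) for t, p in zip(z_true, z_pred)]


def sigma_nmad(z_true, z_pred):
    """Normalised median absolute deviation (assumed definition, see lead-in)."""
    return 1.48 * median(scaled_residuals(z_true, z_pred))


def outlier_fraction(z_true, z_pred, threshold=0.15):
    """Fraction of sources whose scaled residual exceeds the threshold (assumed 0.15)."""
    residuals = scaled_residuals(z_true, z_pred)
    return sum(r > threshold for r in residuals) / len(residuals)


if __name__ == "__main__":
    # Hypothetical spectroscopic and predicted redshifts, for illustration only.
    z_true = [0.5, 1.0, 2.0, 3.0]
    z_pred = [0.5, 1.1, 2.0, 1.0]
    print(sigma_nmad(z_true, z_pred), outlier_fraction(z_true, z_pred))
```

Because both metrics are built on the median or on a fixed threshold, a few catastrophic failures inflate the outlier fraction without moving $\sigma_{\mathrm{NMAD}}$ much, which is why the two numbers are usually reported together.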
-
Measuring photometric redshifts for high-redshift radio source surveys
Authors:
Kieran J. Luken,
Ray P. Norris,
X. Rosalind Wang,
Laurence A. F. Park,
Ying Guo,
Miroslav D. Filipovic
Abstract:
With the advent of deep, all-sky radio surveys, the need for ancillary data to make the most of the new, high-quality radio data from surveys like the Evolutionary Map of the Universe (EMU), GLEAM-X, VLASS and LoTSS is growing rapidly. Radio surveys produce significant numbers of Active Galactic Nuclei (AGNs), and have a significantly higher average redshift when compared with optical and infrared all-sky surveys. Thus, traditional methods of estimating redshift are challenged, with spectroscopic surveys not reaching the redshift depth of radio surveys, and AGNs making it difficult for template fitting methods to accurately model the source. Machine Learning (ML) methods have been used, but efforts have typically been directed towards optically selected samples, or samples at significantly lower redshift than expected from upcoming radio surveys. This work compiles and homogenises a radio-selected dataset from both the northern hemisphere (making use of SDSS optical photometry), and southern hemisphere (making use of Dark Energy Survey optical photometry). We then test commonly used ML algorithms such as k-Nearest Neighbours (kNN), Random Forest, ANNz and GPz on this monolithic radio-selected sample. We show that kNN has the lowest percentage of catastrophic outliers, providing the best match for the majority of science cases in the EMU survey. We note that the wider redshift range of the combined dataset used allows for estimation of sources up to z = 3 before random scatter begins to dominate. When binning the data into redshift bins and treating the problem as a classification problem, we are able to correctly identify $\approx$76% of the highest redshift sources - sources at redshift z $>$ 2.51 - as being in either the highest bin (z $>$ 2.51), or second highest (z = 2.25).
Submitted 12 July, 2023;
originally announced July 2023.
-
Estimating Galaxy Redshift in Radio-Selected Datasets using Machine Learning
Authors:
Kieran J. Luken,
Ray P. Norris,
Laurence A. F. Park,
X. Rosalind Wang,
Miroslav D. Filipovic
Abstract:
All-sky radio surveys are set to revolutionise the field with new discoveries. However, the vast majority of the tens of millions of radio galaxies won't have the spectroscopic redshift measurements required for a large number of science cases. Here, we evaluate techniques for estimating redshifts of galaxies from a radio-selected survey. Using a radio-selected sample with broadband photometry at infrared and optical wavelengths, we test the k-Nearest Neighbours (kNN) and Random Forest machine learning algorithms, testing them both in their regression and classification modes. Further, we test different distance metrics used by the kNN algorithm, including the standard Euclidean distance, the Mahalanobis distance and a learned distance metric for both the regression mode (the Metric Learning for Kernel Regression metric) and the classification mode (the Large Margin Nearest Neighbour metric). We find that all regression-based modes fail on galaxies at a redshift $z > 1$. However, below this range, the kNN algorithm using the Mahalanobis distance metric performs best, with an $\eta_{0.15}$ outlier rate of 5.85%. In the classification mode, the kNN algorithm using the Mahalanobis distance metric also performs best, with an $\eta_{0.15}$ outlier rate of 5.85%, correctly placing 74% of galaxies in the top $z > 1.02$ bin. Finally, we also tested the effect of training in one field and applying the trained algorithm to similar data from another field and found that variation across fields does not result in statistically significant differences in predicted redshifts. Importantly, we find that while we may not be able to predict a continuous value for high-redshift radio sources, we can identify the majority of them using the classification modes of existing techniques.
Submitted 27 February, 2022;
originally announced February 2022.
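The abstract above compares kNN distance metrics, including the Mahalanobis distance. As an illustration only, and not the authors' code, the sketch below runs kNN regression with a Mahalanobis distance on two photometric features using just the standard library. The restriction to two features, the sample covariance estimated from the training set, the helper names and the toy data are all assumptions made to keep the example short.

```python
from statistics import mean


def covariance_2d(points):
    """Sample covariance terms (sxx, sxy, syy) of a list of (x, y) points."""
    mx = mean(p[0] for p in points)
    my = mean(p[1] for p in points)
    n = len(points)
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    return sxx, sxy, syy


def inverse_2x2(sxx, sxy, syy):
    """Inverse of the symmetric matrix [[sxx, sxy], [sxy, syy]]."""
    det = sxx * syy - sxy ** 2
    return syy / det, -sxy / det, sxx / det


def mahalanobis(a, b, inv_cov):
    """Mahalanobis distance between points a and b given the inverse covariance."""
    ixx, ixy, iyy = inv_cov
    dx, dy = a[0] - b[0], a[1] - b[1]
    return (ixx * dx * dx + 2 * ixy * dx * dy + iyy * dy * dy) ** 0.5


def knn_redshift(train_x, train_z, query, k=3):
    """Mean redshift of the k training sources closest to the query."""
    inv_cov = inverse_2x2(*covariance_2d(train_x))
    ranked = sorted(
        zip(train_x, train_z),
        key=lambda pair: mahalanobis(pair[0], query, inv_cov),
    )
    return mean(z for _, z in ranked[:k])


if __name__ == "__main__":
    # Hypothetical (colour, colour) features and redshifts, for illustration only.
    train_x = [(0.0, 0.0), (1.0, 0.2), (2.0, 0.1), (3.0, 0.4), (4.0, 0.3)]
    train_z = [0.1, 0.2, 0.3, 0.4, 0.5]
    print(knn_redshift(train_x, train_z, query=(0.9, 0.2), k=1))
```

Unlike the Euclidean distance, the Mahalanobis distance rescales and decorrelates the features through the inverse covariance, so features with very different dynamic ranges do not dominate the neighbour search.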
-
Mysterious Odd Radio Circle near the Large Magellanic Cloud -- An Intergalactic Supernova Remnant?
Authors:
Miroslav D. Filipović,
J. L. Payne,
R. Z. E. Alsaberi,
R. P. Norris,
P. J. Macgregor,
L. Rudnick,
B. S. Koribalski,
D. Leahy,
L. Ducci,
R. Kothes,
H. Andernach,
L. Barnes,
I. S. Bojičić,
L. M. Bozzetto,
R. Brose,
J. D. Collier,
E. J. Crawford,
R. M. Crocker,
S. Dai,
T. J. Galvin,
F. Haberl,
U. Heber,
T. Hill,
A. M. Hopkins,
N. Hurley-Walker
, et al. (26 additional authors not shown)
Abstract:
We report the discovery of J0624-6948, a low-surface-brightness radio ring lying between the Galactic Plane and the Large Magellanic Cloud (LMC). It was first detected at 888 MHz with the Australian Square Kilometre Array Pathfinder (ASKAP), with a diameter of ~196 arcsec. This source has phenomenological similarities to Odd Radio Circles (ORCs). Significant differences from the known ORCs (a flatter radio spectral index, the lack of a prominent central galaxy as a possible host, and a larger apparent size) suggest that J0624-6948 may be a different type of object. We argue that the most plausible explanation for J0624-6948 is an intergalactic supernova remnant from a star that resided in the LMC outskirts and underwent a single-degenerate type Ia supernova, and that we are seeing its remnant expand into a rarefied, intergalactic environment. We also examine whether a massive star or a white dwarf binary ejected from either galaxy could be the supernova progenitor. Finally, we consider several other hypotheses for the nature of the object, including the jets of an active galactic nucleus (AGN) or the remnant of a nearby stellar super-flare.
Submitted 24 January, 2022;
originally announced January 2022.
-
Missing Data Imputation for Galaxy Redshift Estimation
Authors:
Kieran J. Luken,
Rabina Padhy,
X. Rosalind Wang
Abstract:
Astronomical data is full of holes. While there are many reasons for this missing data, the data can be randomly missing, caused by things like data corruption or unfavourable observing conditions. We test some simple data imputation methods (Mean, Median, Minimum, Maximum and k-Nearest Neighbours (kNN)), as well as two more complex methods (Multivariate Imputation by Chained Equations (MICE) and Generative Adversarial Imputation Network (GAIN)), against data where increasing amounts are randomly set to missing. We then use the imputed datasets to estimate the redshift of the galaxies, using the kNN and Random Forest ML techniques. We find that the MICE algorithm provides the lowest Root Mean Square Error and consequently the lowest prediction error, with the GAIN algorithm the next best.
Submitted 26 November, 2021;
originally announced November 2021.
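The simple imputation strategies named in the abstract above (mean, median, minimum, maximum) and the root mean square error used to score them can be sketched as follows. This is an illustrative sketch only: missing values are represented by `None`, each column is imputed on its own, and the function names and sample values are hypothetical. The kNN, MICE and GAIN methods are not reproduced here.

```python
from math import sqrt
from statistics import mean, median

# Map each simple strategy name to the function that computes the fill value.
STRATEGIES = {"mean": mean, "median": median, "min": min, "max": max}


def impute(column, strategy):
    """Replace None entries with a single value derived from the observed entries."""
    observed = [v for v in column if v is not None]
    fill = STRATEGIES[strategy](observed)
    return [fill if v is None else v for v in column]


def rmse(truth, estimate):
    """Root mean square error between the true and the imputed column."""
    return sqrt(mean((t - e) ** 2 for t, e in zip(truth, estimate)))


if __name__ == "__main__":
    # Hypothetical magnitudes; one value is masked to mimic randomly missing data.
    truth = [1.0, 2.0, 3.0, 10.0]
    masked = [1.0, None, 3.0, 10.0]
    for name in STRATEGIES:
        print(name, rmse(truth, impute(masked, name)))
```

Scoring against a column whose true values are known, as done here, mirrors the experiment described in the abstract, where known values are deliberately set to missing so that the imputation error can be measured.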
-
Radio Observations of Supernova Remnant G1.9+0.3
Authors:
Kieran J. Luken,
Miroslav D. Filipović,
Nigel I. Maxted,
Roland Kothes,
Ray P. Norris,
James R. Allison,
Rebecca Blackwell,
Catherine Braiding,
Robert Brose,
Michael Burton,
Ain Y. De Horta,
Tim J. Galvin,
Lisa Harvey-Smith,
Natasha Hurley-Walker,
Denis Leahy,
Nicholas O. Ralph,
Quentin Roper,
Gavin Rowell,
Iurii Sushch,
Dejan Urošević,
Graeme F. Wong
Abstract:
We present 1 to 10 GHz radio continuum flux density, spectral index, polarisation and Rotation Measure (RM) images of the youngest known Galactic Supernova Remnant (SNR) G1.9+0.3, using observations from the Australia Telescope Compact Array (ATCA). We have conducted an expansion study spanning 8 epochs between 1984 and 2017, yielding results consistent with previous expansion studies of G1.9+0.3. We find a mean radio continuum expansion rate of ($0.78 \pm 0.09$) per cent year$^{-1}$ (or $\sim8900$ km s$^{-1}$ at an assumed distance of 8.5 kpc), although the expansion rate varies across the SNR perimeter. In the case of the most recent epoch between 2016 and 2017, we observe faster-than-expected expansion of the northern region. We find a global spectral index for G1.9+0.3 of $-0.81\pm0.02$ (76 MHz$-$10 GHz). Towards the northern region, however, the radio spectrum is observed to steepen significantly ($\sim -1$). Towards the two so-called (east and west) "ears" of G1.9+0.3, we find very different RM values of 400-600 rad m$^{-2}$ and 100-200 rad m$^{-2}$, respectively. The fractional polarisation of the radio continuum emission reaches ($19 \pm 2$) per cent, consistent with other, slightly older, SNRs such as Cas A.
Submitted 3 December, 2019;
originally announced December 2019.
-
Non-thermal emission from the reverse shock of the youngest galactic Supernova remnant G1.9+0.3
Authors:
R. Brose,
I. Sushch,
M. Pohl,
K. J. Luken,
M. D. Filipovic,
R. Lin
Abstract:
Context. The youngest Galactic supernova remnant G1.9+0.3 is an interesting target for next generation gamma-ray observatories. So far, the remnant is only detected in the radio and the X-ray bands, but its young age of ~100 yrs and inferred shock speed of ~14,000 km/s could make it an efficient particle accelerator. Aims. We aim to model the observed radio and X-ray spectra together with the morphology of the remnant. At the same time, we aim to estimate the gamma-ray flux from the source and evaluate the prospects of its detection with future gamma-ray experiments. Methods. We performed spherically symmetric 1-D simulations with the RATPaC code, in which we simultaneously solve the transport equation for cosmic rays, the transport equation for magnetic turbulence, and the hydrodynamical equations for the gas flow. Separately computed distributions of the particles accelerated at the forward and the reverse shock are then used to calculate the spectra of synchrotron, inverse Compton, and pion-decay radiation from the source. Results. The emission from G1.9+0.3 can be self-consistently explained within the test-particle limit. We find that the X-ray flux is dominated by emission from the forward shock while most of the radio emission originates near the reverse shock, which makes G1.9+0.3 the first remnant with non-thermal radiation detected from the reverse shock. The flux of very-high-energy gamma-ray emission from G1.9+0.3 is expected to be close to the sensitivity threshold of the Cherenkov Telescope Array (CTA). The limited time available to grow large-scale turbulence limits the maximum energy of particles to values below 100 TeV, hence G1.9+0.3 is not a PeVatron.
Submitted 6 June, 2019;
originally announced June 2019.
-
Discovery of a Pulsar-powered Bow Shock Nebula in the Small Magellanic Cloud Supernova Remnant DEM S5
Authors:
Rami Z. E. Alsaberi,
C. Maitra,
M. D. Filipović,
L. M. Bozzetto,
F. Haberl,
P. Maggi,
M. Sasaki,
P. Manjolović,
V. Velović,
P. Kavanagh,
N. I. Maxted,
D. Urošević,
G. P. Rowell,
G. F. Wong,
B. -Q. For,
A. N. O'Brien,
T. J. Galvin,
L. Staveley-Smith,
R. P. Norris,
T. Jarrett,
R. Kothes,
K. J. Luken,
N. Hurley-Walker,
H. Sano,
D. Onić
, et al. (10 additional authors not shown)
Abstract:
We report the discovery of a new Small Magellanic Cloud Pulsar Wind Nebula (PWN) at the edge of the Supernova Remnant (SNR) DEM S5. The pulsar-powered object has a cometary morphology similar to the Galactic PWN analogs PSR B1951+32 and 'the Mouse'. It is travelling supersonically through the interstellar medium. We estimate the pulsar kick velocity to be in the range of 700-2000 km/s for an age between 28 and 10 kyr. The radio spectral index for this SNR PWN pulsar system is flat ($-0.29 \pm 0.01$), consistent with other similar objects. We infer that the putative pulsar has a radio spectral index of -1.8, which is typical for Galactic pulsars. We searched for dispersion measures (DMs) up to 1000 pc cm$^{-3}$ but found no convincing candidates with a S/N greater than 8. We produce a polarisation map for this PWN at 5500 MHz and find a mean fractional polarisation of P $\sim 23$ per cent. The X-ray power-law spectrum ($\Gamma \sim 2$) is indicative of non-thermal synchrotron emission, as is expected from a PWN-pulsar system. Finally, we detect DEM S5 in Infrared (IR) bands. Our IR photometric measurements strongly indicate the presence of shocked gas, which is expected for SNRs. However, it is unusual to detect such IR emission in an SNR with a supersonic bow-shock PWN. We also find a low-velocity HI cloud of $\sim 107$ km/s which is possibly interacting with DEM S5. SNR DEM S5 is the first confirmed detection of a pulsar-powered bow shock nebula found outside the Galaxy.
Submitted 11 April, 2019; v1 submitted 7 March, 2019;
originally announced March 2019.
-
Preliminary results of using k-Nearest Neighbours Regression to estimate the redshift of radio selected datasets
Authors:
Kieran J. Luken,
Ray P. Norris,
Laurence A. F. Park
Abstract:
In the near future, all-sky radio surveys are set to produce catalogues of tens of millions of sources with limited multi-wavelength photometry. Spectroscopic redshifts will only be possible for a small fraction of these new-found sources. In this paper, we provide the first in-depth investigation into the use of k-Nearest Neighbours Regression for the estimation of redshift of these sources. We use the Australia Telescope Large Area Survey radio data, combined with the Spitzer Wide-Area Infrared Extragalactic Survey infrared, the Dark Energy Survey optical and the Australian Dark Energy Survey spectroscopic survey data. We then reduce the depth of photometry to match what is expected from the upcoming Evolutionary Map of the Universe survey, testing against both data sets. To examine the generalisation of our methods, we test one of the sub-fields of the Australia Telescope Large Area Survey against the other. We achieve an outlier rate of ~10% across all tests, showing that the k-Nearest Neighbours regression algorithm is an acceptable method of estimating redshift, and would perform better given a sample training set with uniform redshift coverage.
Submitted 25 October, 2018;
originally announced October 2018.