-
Rotation and flipping invariant self-organizing maps with astronomical images: A cookbook and application to the VLA Sky Survey QuickLook images
Authors:
A. N. Vantyghem,
T. J. Galvin,
B. Sebastian,
C. P. O'Dea,
Y. A. Gordon,
M. Boyce,
L. Rudnick,
K. Polsterer,
Heinz Andernach,
M. Dionyssiou,
P. Venkataraman,
R. Norris,
S. A. Baum,
X. R. Wang,
M. Huynh
Abstract:
Modern wide-field radio surveys typically detect millions of objects. Techniques based on machine learning are proving useful for classifying large numbers of objects. The self-organizing map (SOM) is an unsupervised machine learning algorithm that projects a many-dimensional dataset onto a two- or three-dimensional lattice of neurons. This dimensionality reduction allows the user to better visualize common features of the data and to develop classification algorithms that would otherwise be impractical for large datasets. To this end, we use the PINK implementation of a SOM. PINK incorporates rotation and flipping invariance so that the SOM algorithm may be applied to astronomical images. In this cookbook we provide instructions for working with PINK, including preprocessing the input images and training the model, and we offer lessons learned through experimentation. The problem of imbalanced classes can be mitigated by careful selection of the training sample and by increasing the number of neurons in the SOM (chosen by the user). Because PINK is not scale-invariant, structure can be smeared in the neurons. This can also be mitigated by increasing the number of neurons in the SOM. We also introduce pyink, a Python package used to read and write PINK binary files, assist in common preprocessing operations, perform standard analyses, visualize the SOM and preprocessed images, and create image-based annotations using a graphical interface. A tutorial is also provided to guide the user through the entire process. We present an application of PINK to VLA Sky Survey (VLASS) images. We demonstrate that PINK is generally able to group VLASS sources with similar morphology together. We use the results of PINK to estimate the probability that a given source in the VLASS QuickLook Catalogue is actually due to sidelobe contamination.
Submitted 15 April, 2024;
originally announced April 2024.
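The rotation and flipping invariance described in the abstract above can be illustrated with a minimal sketch. It is not PINK or pyink code: the function names are hypothetical, images are assumed to be square NumPy arrays, and only 90-degree rotations are tried, whereas a real implementation would likely sample many more angles. The idea shown is that an image is compared with a neuron under every allowed transform and the smallest Euclidean distance is kept.

```python
import numpy as np


def transforms(image):
    """Yield the 8 copies of a square image: 4 right-angle rotations, each with and without a flip."""
    for candidate in (image, np.fliplr(image)):
        for quarter_turns in range(4):
            yield np.rot90(candidate, quarter_turns)


def invariant_distance(image, neuron):
    """Smallest Euclidean distance between the neuron and any transformed copy of the image."""
    return min(float(np.linalg.norm(t - neuron)) for t in transforms(image))


def best_matching_unit(image, neurons):
    """Index of the neuron closest to the image under the invariant distance."""
    distances = [invariant_distance(image, neuron) for neuron in neurons]
    return int(np.argmin(distances))
```

With this distance, an image and a rotated or mirrored copy of it map to the same neuron, which is the property that makes a SOM usable on sources with arbitrary orientation on the sky.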
-
Hydra II: Characterisation of Aegean, Caesar, ProFound, PyBDSF, and Selavy source finders
Authors:
M. M. Boyce,
A. M. Hopkins,
S. Riggi,
L. Rudnick,
M. Ramsay,
C. L. Hale,
J. Marvil,
M. Whiting,
P. Venkataraman,
C. P. O'Dea,
S. A. Baum,
Y. A. Gordon,
A. N. Vantyghem,
M. Dionyssiou,
H. Andernach,
J. D. Collier,
J. English,
B. S. Koribalski,
D. Leahy,
M. J. Michałowski,
S. Safi-Harb,
M. Vaccari,
E. Alexander,
M. Cowley,
A. D. Kapinska
, et al. (2 additional authors not shown)
Abstract:
We present a comparison between the performance of a selection of source finders using a new software tool called Hydra. The companion paper, Paper I, introduced the Hydra tool and demonstrated its performance using simulated data. Here we apply Hydra to assess the performance of different source finders by analysing real observational data taken from the Evolutionary Map of the Universe (EMU) Pilot Survey. EMU is a wide-field radio continuum survey whose primary goal is to make a deep ($20\,\mu$Jy/beam RMS noise), intermediate angular resolution ($15^{\prime\prime}$), $1\,$GHz survey of the entire sky south of $+30^{\circ}$ declination, and it is expected to detect and catalogue up to 40 million sources. With the main EMU survey expected to begin in 2022, it is highly desirable to understand the performance of radio image source finder software and to identify an approach that optimises source detection capabilities. Hydra has been developed to refine this process, as well as to deliver a range of metrics and source finding data products from multiple source finders. We present the performance of the five source finders tested here in terms of their completeness and reliability statistics, their flux density and source size measurements, and an exploration of case studies to highlight finder-specific limitations.
Submitted 27 April, 2023;
originally announced April 2023.
-
Hydra I: An extensible multi-source-finder comparison and cataloguing tool
Authors:
M. M. Boyce,
A. M. Hopkins,
S. Riggi,
L. Rudnick,
M. Ramsay,
C. L. Hale,
J. Marvil,
M. Whiting,
P. Venkataraman,
C. P. O'Dea,
S. A. Baum,
Y. A. Gordon,
A. N. Vantyghem,
M. Dionyssiou,
H. Andernach,
J. D. Collier,
J. English,
B. S. Koribalski,
D. Leahy,
M. J. Michałowski,
S. Safi-Harb,
M. Vaccari,
E. Alexander,
M. Cowley,
A. D. Kapinska
, et al. (2 additional authors not shown)
Abstract:
The latest generation of radio surveys is now producing sky survey images containing many millions of radio sources. In this context it is highly desirable to understand the performance of radio image source finder (SF) software and to identify an approach that optimises source detection capabilities. We have created Hydra to be an extensible multi-SF and cataloguing tool that can be used to compare and evaluate different SFs. Hydra, which currently includes the SFs Aegean, Caesar, ProFound, PyBDSF, and Selavy, provides for the addition of new SFs through containerisation and configuration files. The SF input RMS noise and island parameters are optimised to a 90% "percentage real detections" threshold (calculated from the difference between detections in the real and inverted images), to enable comparison between SFs. Hydra provides completeness and reliability diagnostics through observed-deep ($\mathcal{D}$) and generated-shallow ($\mathcal{S}$) images, as well as other statistics. In addition, it has a visual inspection tool for comparing residual images through various selection filters, such as S/N bins in completeness or reliability. The tool allows the user to easily compare and evaluate different SFs in order to choose their desired SF, or a combination thereof. This paper is part one of a two-part series. In this paper we introduce the Hydra software suite and validate its $\mathcal{D/S}$ metrics using simulated data. The companion paper demonstrates the utility of Hydra by comparing the performance of SFs using both simulated and real images.
Submitted 27 April, 2023;
originally announced April 2023.
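The "percentage real detections" threshold mentioned in the abstract above can be made concrete with a short sketch. The abstract only says the quantity is calculated from the difference between detections in the real and inverted images; the normalisation by the real-image count used here is an assumption, and the function name is hypothetical rather than part of Hydra.

```python
def percentage_real_detections(n_real_image, n_inverted_image):
    """Assumed form: 100 * (N_real - N_inverted) / N_real.

    n_real_image: detections in the real image.
    n_inverted_image: detections in the inverted image, taken as a
    proxy for the number of false detections.
    """
    if n_real_image <= 0:
        raise ValueError("the real image must contain at least one detection")
    return 100.0 * (n_real_image - n_inverted_image) / n_real_image
```

Under this reading, 1000 detections in the real image and 100 in the inverted image give a value of 90, which is the threshold quoted in the abstract.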
-
A Quick Look at the $3\,$GHz Radio Sky I. Source Statistics from the Very Large Array Sky Survey
Authors:
Yjan A. Gordon,
Michelle M. Boyce,
Christopher P. O'Dea,
Lawrence Rudnick,
Heinz Andernach,
Adrian N. Vantyghem,
Stefi A. Baum,
Jean-Paul Bui,
Mathew Dionyssiou,
Samar Safi-Harb,
Isabel Sander
Abstract:
The Very Large Array Sky Survey (VLASS) is observing the entire sky north of $-40^{\circ}$ in the S-band ($2<ν<4\,$GHz), with the highest angular resolution ($2''.5$) of any all-sky radio continuum survey to date. VLASS will cover its entire footprint over three distinct epochs, the first of which has now been observed in full. Based on Quick Look images from this first epoch, we have created a catalog of $1.9\times10^{6}$ reliably detected radio components. Due to the limitations of the Quick Look images, component flux densities are underestimated by $\sim 15\,\%$ at $S_{\text{peak}}>3\,$mJy/beam and are often unreliable for fainter components. We use this catalog to perform statistical analyses of the $ν\sim 3\,$GHz radio sky. Comparisons with the Faint Images of the Radio Sky at Twenty cm survey (FIRST) show the typical $1.4-3\,$GHz spectral index, $α$, to be $\sim-0.71$. The radio color-color distribution of point and extended components is explored by matching with FIRST and the LOFAR Two Meter Sky Survey. We present the VLASS source counts, $dN/dS$, which are found to be consistent with previous observations at $1.4$ and $3\,$GHz. Resolution improvements over FIRST result in excess power in the VLASS two-point correlation function at angular scales $\lesssim 7''$, and in $18\,\%$ of active galactic nuclei associated with a single FIRST component being split into multi-component sources by VLASS.
Submitted 25 May, 2021; v1 submitted 23 February, 2021;
originally announced February 2021.
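The $1.4$ to $3\,$GHz spectral index quoted in the abstract above follows from two flux density measurements. The sketch below assumes the convention $S \propto ν^{α}$, which matches the negative value reported; the function name and the example numbers are illustrative and are not taken from the catalog.

```python
import math


def spectral_index(s_low, nu_low, s_high, nu_high):
    """Two-point spectral index alpha, assuming S is proportional to nu**alpha.

    s_low, s_high: flux densities at the two frequencies (same units).
    nu_low, nu_high: the two frequencies (same units).
    """
    return math.log(s_high / s_low) / math.log(nu_high / nu_low)


# Illustrative values only: a 10 mJy source at 1.4 GHz with alpha = -0.71.
s_first = 10.0
s_vlass = s_first * (3.0 / 1.4) ** -0.71
alpha = spectral_index(s_first, 1.4, s_vlass, 3.0)
```

Because the abstract notes that Quick Look flux densities are underestimated, an index computed directly from uncorrected measurements would be biased towards steeper values.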