
Radio Galaxy Zoo: tagging radio subjects using text

Authors: Dawei Chen, Vinay Kerai, Matthew J. Alger, O. Ivy Wong, Cheng Soon Ong

Abstract: RadioTalk is a communication platform that enabled members of the Radio Galaxy Zoo (RGZ) citizen science project to engage in discussion threads and provide further descriptions of the radio subjects they were observing in the form of tags and comments. It contains a wealth of auxiliary information which is useful for the morphology identification of complex and extended radio sources. In this paper, we present this new dataset, and for the first time in radio astronomy, we combine text and images to automatically classify radio galaxies using a multi-modal learning approach. We found that incorporating text features improved classification performance, which demonstrates that text annotations are rare but valuable sources of information for classifying astronomical sources, and suggests the importance of exploiting multi-modal information in future citizen science projects. We also discovered over 10,000 new radio sources beyond the RGZ-DR1 catalogue in this dataset.

Submitted 12 October, 2023; originally announced October 2023.

Comments: 14 pages, 9 figures, accepted for publication in PASA

arXiv:2102.10903

doi 10.1017/pasa.2021.10

Interpretable Faraday Complexity Classification

Authors: M. J. Alger, J. D. Livingston, N. M. McClure-Griffiths, J. L. Nabaglo, O. I. Wong, C. S. Ong

Abstract: Faraday complexity describes whether a spectropolarimetric observation has simple or complex magnetic structure. Quickly determining the Faraday complexity of a spectropolarimetric observation is important for processing large, polarised radio surveys. Finding simple sources lets us build rotation measure grids, and finding complex sources lets us follow these sources up with slower analysis techniques or further observations. We introduce five features that can be used to train simple, interpretable machine learning classifiers for estimating Faraday complexity. We train logistic regression and extreme gradient boosted tree classifiers on simulated polarised spectra using our features, analyse their behaviour, and demonstrate our features are effective for both simulated and real data. This is the first application of machine learning methods to real spectropolarimetry data. With 95 per cent accuracy on simulated ASKAP data and 90 per cent accuracy on simulated ATCA data, our method performs comparably to state-of-the-art convolutional neural networks while being simpler and easier to interpret. Logistic regression trained with our features behaves sensibly on real data and its outputs are useful for sorting polarised sources by apparent Faraday complexity.

Submitted 22 February, 2021; originally announced February 2021.

Comments: Accepted for publication in PASA

Journal ref: Publ. Astron. Soc. Aust. 38 (2021) e022

arXiv:2102.01139

doi 10.1093/mnras/stab253

Heightened Faraday Complexity in the inner 1 kpc of the Galactic Centre

Authors: J. D. Livingston, N. M. McClure-Griffiths, B. M. Gaensler, A. Seta, M. J. Alger

Abstract: We have measured the Faraday rotation of 62 extra-galactic background sources in 58 fields using the CSIRO Australia Telescope Compact Array (ATCA) with a frequency range of 1.1 - 3.1 GHz with 2048 channels. Our sources cover a region $\sim 12\,\mathrm{deg} \times 12\,\mathrm{deg}$ ($\sim 1$ kpc) around the Galactic Centre region. We show that the Galactic Plane for $|l| < 10^\circ$ exhibits large Rotation Measures (RMs) with a maximum $|\mathrm{RM}|$ of $1691.2 \pm 4.9\,\mathrm{rad}\,\mathrm{m}^{-2}$ and a mean $|\mathrm{RM}| = 219 \pm 42\,\mathrm{rad}\,\mathrm{m}^{-2}$. The RMs decrease in magnitude with increasing projected distance from the Galactic Plane, broadly consistent with previous findings. We find that an unusually high fraction (95\%) of the sources show Faraday complexity consistent with multiple Faraday components. We attribute the presence of multiple Faraday rotating screens with widely separated Faraday depths to small-scale turbulent RM structure in the Galactic Centre region. The second order structure function (SF) of the RM in the Galactic Centre displays a line with a gradient of zero for angular separations spanning $0.83^\circ - 11^\circ$ ($\sim 120 - 1500$ pc), which is expected for scales larger than the outer scale (or driving scale) of magneto-ionic turbulence. We place an upper limit on any break in the SF gradient of 66'', corresponding to an inferred upper limit to the outer scale of turbulence in the inner 1 kpc of the Galactic Centre of $3$ pc. We propose stellar feedback as the probable driver of this small-scale turbulence.

Submitted 1 February, 2021; originally announced February 2021.

Comments: 15+66 pages, 11+65 figures, 3 tables. In press with MNRAS
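
The second-order structure function mentioned in this abstract is conventionally defined as $\mathrm{SF}(\delta\theta) = \langle [\mathrm{RM}(\theta) - \mathrm{RM}(\theta + \delta\theta)]^2 \rangle$, averaged over all source pairs in a bin of angular separation. The following is a minimal sketch of that estimator, not the authors' pipeline: the function names, the binning scheme, and the absence of any noise-bias correction are assumptions made for illustration. The last helper converts an angular scale to a physical one with the small-angle approximation; the Galactic Centre distance passed to it is an assumed input, since the abstract does not state the value used.

```python
import math
from itertools import combinations


def angular_separation_deg(lon1, lat1, lon2, lat2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    lon1, lat1, lon2, lat2 = (math.radians(v) for v in (lon1, lat1, lon2, lat2))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(min(1.0, h))))


def rm_structure_function(lon, lat, rm, bin_edges):
    """Mean squared RM difference per separation bin.

    lon, lat  : source coordinates in degrees (hypothetical inputs)
    rm        : rotation measures in rad m^-2
    bin_edges : ascending separation bin edges in degrees
    Returns (sf, counts); sf is NaN for bins that contain no pairs.
    """
    n_bins = len(bin_edges) - 1
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for i, j in combinations(range(len(rm)), 2):
        sep = angular_separation_deg(lon[i], lat[i], lon[j], lat[j])
        for k in range(n_bins):
            if bin_edges[k] <= sep < bin_edges[k + 1]:
                sums[k] += (rm[i] - rm[j]) ** 2
                counts[k] += 1
                break
    sf = [s / c if c else float("nan") for s, c in zip(sums, counts)]
    return sf, counts


def angular_to_physical_pc(theta_arcsec, distance_pc):
    """Small-angle conversion of an angular scale to a physical scale in pc."""
    return math.radians(theta_arcsec / 3600.0) * distance_pc
```

A flat structure function would show up here as roughly equal `sf` values across bins. With an assumed distance of 8.2 kpc, `angular_to_physical_pc(66, 8200)` gives about 2.6 pc, which is consistent with the roughly 3 pc upper limit quoted in the abstract.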

arXiv:1906.02864

doi 10.1088/1538-3873/ab213d

Radio Galaxy Zoo: Unsupervised Clustering of Convolutionally Auto-encoded Radio-astronomical Images

Authors: Nicholas O. Ralph, Ray P. Norris, Gu Fang, Laurence A. F. Park, Timothy J. Galvin, Matthew J. Alger, Heinz Andernach, Chris Lintott, Lawrence Rudnick, Stanislav Shabala, O. Ivy Wong

Abstract: This paper demonstrates a novel and efficient unsupervised clustering method with the combination of a Self-Organising Map (SOM) and a convolutional autoencoder. The rapidly increasing volume of radio-astronomical data has increased demand for machine learning methods as solutions to classification and outlier detection. Major astronomical discoveries are unplanned and found in the unexpected, making unsupervised machine learning highly desirable by operating without assumptions and labelled training data. Our approach shows SOM training time is drastically reduced and high-level features can be clustered by training on auto-encoded feature vectors instead of raw images. Our results demonstrate this method is capable of accurately separating outliers on a SOM with neighbourhood similarity and K-means clustering of radio-astronomical features complexity. We present this method as a powerful new approach to data exploration by providing a detailed understanding of the morphology and relationships of Radio Galaxy Zoo (RGZ) dataset image features which can be applied to new radio survey data.

Submitted 6 June, 2019; originally announced June 2019.

Comments: Accepted in Publications of the Astronomical Society of the Pacific, special issue on Machine Intelligence in Astronomy and Astrophysics. 23 pages, 8 full-page colour figures

arXiv:1904.02876

doi 10.1088/1538-3873/ab150b

Radio Galaxy Zoo: Knowledge Transfer Using Rotationally Invariant Self-Organising Maps

Authors: T. J. Galvin, M. Huynh, R. P. Norris, X. R. Wang, E. Hopkins, O. I. Wong, S. Shabala, L. Rudnick, M. J. Alger, K. L. Polsterer

Abstract: With the advent of large-scale surveys, the manual analysis and classification of individual radio source morphologies is rendered impossible as existing approaches do not scale. The analysis of complex morphological features in the spatial domain is a particularly important task. Here we discuss the challenges of transferring crowdsourced labels obtained from the Radio Galaxy Zoo project and introduce a proper transfer mechanism via quantile random forest regression. By using parallelized rotation and flipping invariant Kohonen-maps, image cubes of Radio Galaxy Zoo selected galaxies formed from the FIRST radio continuum and WISE infrared all sky surveys are first projected down to a two-dimensional embedding in an unsupervised way. This embedding can be seen as a discretised space of shapes with the coordinates reflecting morphological features as expressed by the automatically derived prototypes. We find that these prototypes have reconstructed physically meaningful processes across two channel images at radio and infrared wavelengths in an unsupervised manner. In the second step, images are compared with those prototypes to create a heat-map, which is the morphological fingerprint of each object and the basis for transferring the user generated labels. These heat-maps have reduced the feature space by a factor of 248 and are able to be used as the basis for subsequent ML methods. Using an ensemble of decision trees we achieve upwards of 85.7% and 80.7% accuracy when predicting the number of components and peaks in an image, respectively, using these heat-maps. We also question the currently used discrete classification schema and introduce a continuous scale that better reflects the uncertainty in transition between two classes, caused by sensitivity and resolution limits.

Submitted 5 April, 2019; originally announced April 2019.

arXiv:1805.12008

doi 10.1093/mnras/sty2646

Radio Galaxy Zoo: ClaRAN - A Deep Learning Classifier for Radio Morphologies

Authors: Chen Wu, O. Ivy Wong, Lawrence Rudnick, Stanislav S. Shabala, Matthew J. Alger, Julie K. Banfield, Cheng Soon Ong, Sarah V. White, Avery F. Garon, Ray P. Norris, Heinz Andernach, Jean Tate, Vesna Lukic, Hongming Tang, Kevin Schawinski, Foivos I. Diakogiannis

Abstract: The upcoming next-generation large area radio continuum surveys can expect tens of millions of radio sources, rendering the traditional method for radio morphology classification through visual inspection unfeasible. We present ClaRAN - Classifying Radio sources Automatically with Neural networks - a proof-of-concept radio source morphology classifier based upon the Faster Region-based Convolutional Neural Networks (Faster R-CNN) method. Specifically, we train and test ClaRAN on the FIRST and WISE images from the Radio Galaxy Zoo Data Release 1 catalogue. ClaRAN provides end users with automated identification of radio source morphology classifications from a simple input of a radio image and a counterpart infrared image of the same region. ClaRAN is the first open-source, end-to-end radio source morphology classifier that is capable of locating and associating discrete and extended components of radio sources in a fast (< 200 milliseconds per image) and accurate (>= 90 %) fashion. Future work will improve ClaRAN's relatively lower success rates in dealing with multi-source fields and will enable ClaRAN to identify sources on much larger fields without loss in classification accuracy.

Submitted 29 October, 2018; v1 submitted 30 May, 2018; originally announced May 2018.

Comments: 22 pages, 16 figures, Accepted in Monthly Notices of the Royal Astronomical Society

arXiv:1805.05540

doi 10.1093/mnras/sty1308

Radio Galaxy Zoo: Machine learning for radio source host galaxy cross-identification

Authors: M. J. Alger, J. K. Banfield, C. S. Ong, L. Rudnick, O. I. Wong, C. Wolf, H. Andernach, R. P. Norris, S. S. Shabala

Abstract: We consider the problem of determining the host galaxies of radio sources by cross-identification. This has traditionally been done manually, which will be intractable for wide-area radio surveys like the Evolutionary Map of the Universe (EMU). Automated cross-identification will be critical for these future surveys, and machine learning may provide the tools to develop such methods. We apply a standard approach from computer vision to cross-identification, introducing one possible way of automating this problem, and explore the pros and cons of this approach. We apply our method to the 1.4 GHz Australian Telescope Large Area Survey (ATLAS) observations of the Chandra Deep Field South (CDFS) and the ESO Large Area ISO Survey South 1 (ELAIS-S1) fields by cross-identifying them with the Spitzer Wide-area Infrared Extragalactic (SWIRE) survey. We train our method with two sets of data: expert cross-identifications of CDFS from the initial ATLAS data release and crowdsourced cross-identifications of CDFS from Radio Galaxy Zoo. We found that a simple strategy of cross-identifying a radio component with the nearest galaxy performs comparably to our more complex methods, though our estimated best-case performance is near 100 per cent. ATLAS contains 87 complex radio sources that have been cross-identified by experts, so there are not enough complex examples to learn how to cross-identify them accurately. Much larger datasets are therefore required for training methods like ours. We also show that training our method on Radio Galaxy Zoo cross-identifications gives comparable results to training on expert cross-identifications, demonstrating the value of crowdsourced training data.

Submitted 14 May, 2018; originally announced May 2018.
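
The "simple strategy" baseline named in this abstract, assigning each radio component to the nearest galaxy on the sky, can be sketched in a few lines. This is an illustrative brute-force version only, not the authors' code or their machine learning method: the function names, the (RA, Dec) tuple input format, and the lack of any maximum search radius are assumptions.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = (math.radians(v) for v in (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(min(1.0, h))))


def nearest_galaxy_cross_id(radio_positions, galaxy_positions):
    """Match each radio component to its nearest candidate host galaxy.

    radio_positions, galaxy_positions : lists of (ra_deg, dec_deg) tuples
    (hypothetical input format). Returns one (galaxy_index, separation_deg)
    pair per radio component, in input order.
    """
    matches = []
    for r_ra, r_dec in radio_positions:
        seps = [angular_separation_deg(r_ra, r_dec, g_ra, g_dec)
                for g_ra, g_dec in galaxy_positions]
        best = min(range(len(seps)), key=seps.__getitem__)
        matches.append((best, seps[best]))
    return matches
```

This brute-force loop costs O(N x M) separation evaluations, which is fine for small fields; a survey-scale catalogue would more plausibly need a spatial index such as a k-d tree.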

Showing 1–7 of 7 results for author: Alger, M J