-
Self-supervised contrastive learning of radio data for source detection, classification and peculiar object discovery
Authors:
S. Riggi,
T. Cecconello,
S. Palazzo,
A. M. Hopkins,
N. Gupta,
C. Bordiu,
A. Ingallinera,
C. Buemi,
F. Bufano,
F. Cavallaro,
M. D. Filipović,
P. Leto,
S. Loru,
A. C. Ruggeri,
C. Trigilio,
G. Umana,
F. Vitello
Abstract:
New advancements in radio data post-processing are underway within the SKA precursor community, aiming to facilitate the extraction of scientific results from survey images through a semi-automated approach. Several of these developments leverage deep learning (DL) methodologies for diverse tasks, including source detection, object or morphology classification, and anomaly detection. Despite substantial progress, the full potential of these methods often remains untapped due to challenges associated with training large supervised models, particularly in the presence of small and class-unbalanced labelled datasets. Self-supervised learning has recently established itself as a powerful methodology to deal with some of these challenges, by directly learning a lower-dimensional representation from large samples of unlabelled data. The resulting model and data representation can then be used for data inspection and various downstream tasks if a small subset of labelled data is available. In this work, we explored contrastive learning methods to learn a suitable radio data representation from unlabelled images taken from the ASKAP EMU and SARAO MeerKAT GPS surveys. We evaluated the trained models and the obtained data representation over smaller labelled datasets, also taken from different radio surveys, in selected analysis tasks: source detection and classification, and search for objects with peculiar morphology. For all explored downstream tasks, we reported and discussed the benefits brought by self-supervised foundational models built on radio data.
Submitted 29 April, 2024;
originally announced April 2024.
-
Classification of compact radio sources in the Galactic plane with supervised machine learning
Authors:
S. Riggi,
G. Umana,
C. Trigilio,
C. Bordiu,
F. Bufano,
A. Ingallinera,
F. Cavallaro,
Y. Gordon,
R. P. Norris,
G. Gürkan,
P. Leto,
C. Buemi,
S. Loru,
A. M. Hopkins,
M. D. Filipović,
T. Cecconello
Abstract:
Generation of science-ready data from processed data products is one of the major challenges in next-generation radio continuum surveys with the Square Kilometre Array (SKA) and its precursors, due to the expected data volume and the need to achieve a high degree of automated processing. Source extraction, characterization, and classification are the major stages involved in this process. In this work we focus on the classification of compact radio sources in the Galactic plane using both radio and infrared images as inputs. To this end, we produced a curated dataset of ~20,000 images of compact sources of different astronomical classes, obtained from past radio and infrared surveys, and novel radio data from pilot surveys carried out with the Australian SKA Pathfinder (ASKAP). Radio spectral index information was also obtained for a subset of the data. We then trained two different classifiers on the produced dataset. The first model uses gradient-boosted decision trees and is trained on a set of pre-computed features derived from the data, which include radio-infrared colour indices and the radio spectral index. The second model is trained directly on multi-channel images, employing convolutional neural networks. Using a completely supervised procedure, we obtained a high classification accuracy (F1-score > 90%) for separating Galactic objects from the extragalactic background. Individual class discrimination performances, ranging from 60% to 75%, increased by 10% when adding far-infrared and spectral index information, with extragalactic objects, PNe and HII regions identified with higher accuracies. The implemented tools and trained models were publicly released and made available to the radioastronomical community for future application on new radio data.
Submitted 23 February, 2024;
originally announced February 2024.
-
RADiff: Controllable Diffusion Models for Radio Astronomical Maps Generation
Authors:
Renato Sortino,
Thomas Cecconello,
Andrea DeMarco,
Giuseppe Fiameni,
Andrea Pilzer,
Andrew M. Hopkins,
Daniel Magro,
Simone Riggi,
Eva Sciacca,
Adriano Ingallinera,
Cristobal Bordiu,
Filomena Bufano,
Concetto Spampinato
Abstract:
As the Square Kilometre Array (SKA) nears completion, there is an increasing demand for accurate and reliable automated solutions to extract valuable information from the vast amount of data it will make available. Automated source finding is a particularly important task in this context, as it enables the detection and classification of astronomical objects. Deep-learning-based object detection and semantic segmentation models have proven to be suitable for this purpose. However, training such deep networks requires a high volume of labeled data, which is not trivial to obtain in the context of radio astronomy. Since data needs to be manually labeled by experts, this process does not scale to large datasets, limiting the possibilities of leveraging deep networks to address several tasks. In this work, we propose RADiff, a generative approach based on conditional diffusion models trained over an annotated radio dataset to generate synthetic images, containing radio sources of different morphologies, to augment existing datasets and reduce the problems caused by class imbalances. We also show that it is possible to generate fully-synthetic image-annotation pairs to automatically augment any annotated dataset. We evaluate the effectiveness of this approach by training a semantic segmentation model on a real dataset augmented in two ways: 1) using synthetic images obtained from real masks, and 2) generating images from synthetic semantic masks. We show an improvement in performance when applying augmentation, with gains of up to 18% when using real masks and 4% when augmenting with synthetic masks. Finally, we employ this model to generate large-scale radio maps with the objective of simulating Data Challenges.
Submitted 5 July, 2023;
originally announced July 2023.
-
Radio source analysis services for the SKA and precursors
Authors:
Simone Riggi,
Cristobal Bordiu,
Daniel Magro,
Renato Sortino,
Carmelo Pino,
Eva Sciacca,
Filomena Bufano,
Thomas Cecconello,
Giuseppe Vizzari,
Fabio Vitello,
Giuseppe Tudisco
Abstract:
New developments in data processing and visualization are being made in preparation for upcoming radioastronomical surveys planned with the Square Kilometre Array (SKA) and its precursors. A major goal is enabling extraction of science information from the data in a mostly automated way, possibly exploiting the capabilities offered by modern computing infrastructures and technologies. In this context, the integration of source analysis algorithms into data visualization tools is expected to significantly improve and speed up the cataloguing process of large area surveys. To this end, the CIRASA (Collaborative and Integrated platform for Radio Astronomical Source Analysis) project was recently started to develop and integrate a set of services for source extraction, classification and analysis into the ViaLactea visual analytic platform and knowledge base archive. In this contribution, we will present the project objectives and the tools that have been developed, interfaced and deployed so far on the prototype European Open Science Cloud (EOSC) infrastructure provided by the H2020 NEANIAS project.
Submitted 7 January, 2023;
originally announced January 2023.
-
Astronomical source detection in radio continuum maps with deep neural networks
Authors:
S. Riggi,
D. Magro,
R. Sortino,
A. De Marco,
C. Bordiu,
T. Cecconello,
A. M. Hopkins,
J. Marvil,
G. Umana,
E. Sciacca,
F. Vitello,
F. Bufano,
A. Ingallinera,
G. Fiameni,
C. Spampinato,
K. Zarb Adami
Abstract:
Source finding is one of the most challenging tasks in upcoming radio continuum surveys with SKA precursors, such as the Evolutionary Map of the Universe (EMU) survey of the Australian SKA Pathfinder (ASKAP) telescope. The resolution, sensitivity, and sky coverage of such surveys are unprecedented, requiring new features and improvements to be made in existing source finders. Among them are a reduced false detection rate, particularly in the Galactic plane, and the ability to associate multiple disjoint islands into physical objects. To bridge this gap, we developed a new source finder, based on the Mask R-CNN object detection framework, capable of both detecting and classifying compact, extended, spurious, and poorly imaged sources in radio continuum images. The model was trained using ASKAP EMU data, observed during the Early Science and pilot survey phase, and previous radio survey data, taken with the VLA and ATCA telescopes. On the test sample, the final model achieves an overall detection completeness above 85%, a reliability of ~65%, and a classification precision/recall above 90%. Results obtained for all source classes are reported and discussed.
Submitted 5 December, 2022;
originally announced December 2022.
-
Latent Space Explorer: Unsupervised Data Pattern Discovery on the Cloud
Authors:
T. Cecconello,
C. Bordiu,
F. Bufano,
L. Puerari,
S. Riggi,
E. Schisano,
E. Sciacca,
Y. Maruccia,
G. Vizzari
Abstract:
Extracting information from raw data is probably one of the central activities of experimental scientific enterprises. This work describes a pipeline in which a specific model is trained to provide a compact, essential representation of the training data, useful as a starting point for visualization and analyses aimed at detecting patterns and regularities among data. To enable researchers to exploit this approach, a cloud-based system is being developed and tested in the NEANIAS project as one of the ML-tools of a thematic service to be offered to the EOSC. Here, we describe the architecture of the system and introduce two example use cases in the astronomical context.
Submitted 29 April, 2022;
originally announced April 2022.
-
Novel EOSC Services for Space Challenges: The NEANIAS First Outcomes
Authors:
Eva Sciacca,
Mel Krokos,
Ugo Becciani,
Cristobal Bordiu,
Filomena Bufano,
Alessandro Costa,
Carmelo Pino,
Simone Riggi,
Fabio Vitello,
Carlos Brandt,
Angelo Rossi,
Eugenio Topa,
Simone Mantovani,
Laura Vettorello,
Thomas Cecconello,
Giuseppe Vizzari
Abstract:
The European Open Science Cloud (EOSC) initiative faces the challenge of developing an agile, fit-for-purpose, and sustainable service-oriented platform that can address the evolving needs of scientific communities. The NEANIAS project plays an active role in the materialization of the EOSC ecosystem by contributing to the technological, procedural, strategic and business development of EOSC. We present the first outcomes of the NEANIAS activities relating to the co-design and delivery of innovative services for space research for data management and visualization (SPACE-VIS), map making and mosaicing (SPACE-MOS) and pattern and structure detection (SPACE-ML). We include a summary of the collected user requirements driving our services and the methodology for their delivery, together with service access details and pointers to future work.
Submitted 19 January, 2021;
originally announced January 2021.