-
Searching for changing-state AGNs in massive datasets -- I: applying deep learning and anomaly detection techniques to find AGNs with anomalous variability behaviours
Authors:
P. Sánchez-Sáez,
H. Lira,
L. Martí,
N. Sánchez-Pi,
J. Arredondo,
F. E. Bauer,
A. Bayo,
G. Cabrera-Vives,
C. Donoso-Oliva,
P. A. Estévez,
S. Eyheramendy,
F. Förster,
L. Hernández-García,
A. M. Muñoz Arancibia,
M. Pérez-Carrasco,
M. Sepúlveda,
J. R. Vergara
Abstract:
The classic classification scheme for Active Galactic Nuclei (AGNs) was recently challenged by the discovery of the so-called changing-state (changing-look) AGNs (CSAGNs). The physical mechanism behind this phenomenon is still a matter of open debate, and the samples are too small and too serendipitous in nature to provide robust answers. To tackle this problem, we need to design methods that are able to detect AGNs in the act of changing state. Here we present an anomaly detection (AD) technique designed to identify AGN light curves with anomalous behaviours in massive datasets. The main aim of this technique is to identify CSAGNs at different stages of the transition, but it can also be used for more general purposes, such as cleaning massive datasets for AGN variability analyses. We used light curves from the Zwicky Transient Facility data release 5 (ZTF DR5), containing a sample of 230,451 AGNs of different classes. The ZTF DR5 light curves were modeled with a Variational Recurrent Autoencoder (VRAE) architecture, which allowed us to obtain a set of attributes from the VRAE latent space that describes the general behaviour of our sample. These attributes were then used as features for an Isolation Forest (IF) algorithm, which is an anomaly detector for a "one class" kind of problem. We used the VRAE reconstruction errors and the IF anomaly score to select a sample of 8,809 anomalies. These anomalies are dominated by bogus candidates, but we were able to identify 75 promising CSAGN candidates.
Submitted 12 July, 2021; v1 submitted 14 June, 2021;
originally announced June 2021.
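The anomaly-scoring step described in the abstract above (latent-space attributes passed to an Isolation Forest) can be illustrated with a minimal, standard-library sketch. Everything here is an assumption made for illustration: the latent vectors are random stand-ins rather than real VRAE outputs, the function names are hypothetical, and the tree count, sample size, and 4-dimensional latent space are arbitrary choices, not the authors' settings. In practice a library implementation such as scikit-learn's IsolationForest would normally be used instead.

```python
import math
import random

def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n

def build_tree(points, depth, max_depth, rng):
    """Recursively split on a random feature at a random threshold."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:  # cannot split on this feature; stop here (a simplification)
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {"dim": dim, "split": split,
            "left": build_tree(left, depth + 1, max_depth, rng),
            "right": build_tree(right, depth + 1, max_depth, rng)}

def path_length(tree, point, depth=0):
    """Depth at which `point` reaches a leaf, plus a correction for unsplit leaves."""
    if "size" in tree:
        return depth + c_factor(tree["size"])
    branch = "left" if point[tree["dim"]] < tree["split"] else "right"
    return path_length(tree[branch], point, depth + 1)

def anomaly_scores(points, n_trees=100, sample_size=64, seed=0):
    """Return one score in (0, 1) per point; higher means easier to isolate."""
    if len(points) < 2:
        raise ValueError("need at least two points")
    rng = random.Random(seed)
    sample_size = min(sample_size, len(points))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [build_tree(rng.sample(points, sample_size), 0, max_depth, rng)
             for _ in range(n_trees)]
    norm = c_factor(sample_size)
    scores = []
    for p in points:
        mean_path = sum(path_length(t, p) for t in trees) / n_trees
        scores.append(2.0 ** (-mean_path / norm))
    return scores

# Stand-in "latent attributes": 127 ordinary objects plus one planted outlier.
data_rng = random.Random(1)
latent = [tuple(data_rng.gauss(0.0, 1.0) for _ in range(4)) for _ in range(127)]
latent.append((6.0, 6.0, 6.0, 6.0))
scores = anomaly_scores(latent, sample_size=128)
ranked = sorted(range(len(latent)), key=lambda i: scores[i], reverse=True)
```

The abstract says the selection combined these scores with the VRAE reconstruction errors; how the two were combined is not stated there, so this sketch ranks on the Isolation Forest score alone.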
-
The Automatic Learning for the Rapid Classification of Events (ALeRCE) Alert Broker
Authors:
F. Förster,
G. Cabrera-Vives,
E. Castillo-Navarrete,
P. A. Estévez,
P. Sánchez-Sáez,
J. Arredondo,
F. E. Bauer,
R. Carrasco-Davis,
M. Catelan,
F. Elorrieta,
S. Eyheramendy,
P. Huijse,
G. Pignata,
E. Reyes,
I. Reyes,
D. Rodríguez-Mancini,
D. Ruz-Mieres,
C. Valenzuela,
I. Alvarez-Maldonado,
N. Astorga,
J. Borissova,
A. Clocchiatti,
D. De Cicco,
C. Donoso-Oliva,
M. J. Graham, et al. (15 additional authors not shown)
Abstract:
We introduce the Automatic Learning for the Rapid Classification of Events (ALeRCE) broker, an astronomical alert broker designed to provide a rapid and self-consistent classification of large etendue telescope alert streams, such as that provided by the Zwicky Transient Facility (ZTF) and, in the future, the Vera C. Rubin Observatory Legacy Survey of Space and Time (LSST). ALeRCE is a Chilean-led broker run by an interdisciplinary team of astronomers and engineers, working to become intermediaries between survey and follow-up facilities. ALeRCE uses a pipeline which includes the real-time ingestion, aggregation, cross-matching, machine learning (ML) classification, and visualization of the ZTF alert stream. We use two classifiers: a stamp-based classifier, designed for rapid classification, and a light-curve-based classifier, which uses the multi-band flux evolution to achieve a more refined classification. We describe in detail our pipeline, data products, tools and services, which are made public for the community (see https://alerce.science). Since we began operating our real-time ML classification of the ZTF alert stream in early 2019, we have grown a large community of active users around the globe. We describe our results to date, including the real-time processing of $9.7\times10^7$ alerts, the stamp classification of $1.9\times10^7$ objects, the light curve classification of $8.5\times10^5$ objects, the report of 3088 supernova candidates, and different experiments using LSST-like alert streams. Finally, we discuss the challenges ahead to go from a single stream of alerts such as ZTF to a multi-stream ecosystem dominated by LSST.
Submitted 7 August, 2020;
originally announced August 2020.
-
Dimensionality Reduction of SDSS Spectra with Variational Autoencoders
Authors:
Stephen K. N. Portillo,
John K. Parejko,
Jorge R. Vergara,
Andrew J. Connolly
Abstract:
High resolution galaxy spectra contain much information about galactic physics, but the high dimensionality of these spectra makes it difficult to fully utilize the information they contain. We apply variational autoencoders (VAEs), a non-linear dimensionality reduction technique, to a sample of spectra from the Sloan Digital Sky Survey. In contrast to Principal Component Analysis (PCA), a widely used technique, VAEs can capture non-linear relationships between latent parameters and the data. We find that a VAE can reconstruct the SDSS spectra well with only six latent parameters, outperforming PCA with the same number of components. Different galaxy classes are naturally separated in this latent space, without class labels having been given to the VAE. The VAE latent space is interpretable because the VAE can be used to make synthetic spectra at any point in latent space. For example, making synthetic spectra along tracks in latent space yields sequences of realistic spectra that interpolate between two different types of galaxies. Using the latent space to find outliers may yield interesting spectra: in our small sample, we immediately find unusual data artifacts and stars misclassified as galaxies. In this exploratory work, we show that VAEs create compact, interpretable latent spaces that capture non-linear features of the data. While a VAE takes substantial time to train (~1 day for 48000 spectra), once trained, VAEs can enable the fast exploration of large astronomical data sets.
Submitted 9 July, 2020; v1 submitted 24 February, 2020;
originally announced February 2020.
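Two ideas from the abstract above, a low-dimensional Gaussian latent space and "tracks" between two points in that space, can be sketched with the standard VAE building blocks. This is a minimal illustration under stated assumptions: the abstract does not give the loss or architecture, so the reparameterization step and the KL term below are the textbook diagonal-Gaussian forms, and all function names are hypothetical. No decoder is included; in a real model each point on the track would be passed to the trained decoder to produce a synthetic spectrum.

```python
import math
import random

LATENT_DIM = 6  # the abstract reports good reconstructions with six latent parameters

def sample_latent(mu, log_var, rng):
    """Reparameterization trick: z = mu + sigma * eps, with eps drawn from N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

def kl_to_standard_normal(mu, log_var):
    """KL divergence from a diagonal Gaussian N(mu, sigma^2) to N(0, I)."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv
                     for m, lv in zip(mu, log_var))

def latent_track(z_start, z_end, n_steps):
    """Evenly spaced points on the straight line from z_start to z_end."""
    if n_steps < 2:
        raise ValueError("a track needs at least two points")
    return [[a + (b - a) * i / (n_steps - 1) for a, b in zip(z_start, z_end)]
            for i in range(n_steps)]

# Usage: sample one latent vector and build a 5-point track between two
# made-up latent positions standing in for two galaxy types.
rng = random.Random(0)
z = sample_latent([0.0] * LATENT_DIM, [0.0] * LATENT_DIM, rng)
track = latent_track([0.0] * LATENT_DIM, [2.0] * LATENT_DIM, 5)
```

Linear interpolation is only one possible choice of track; the abstract does not say which paths through latent space were used.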