Open Access
Issue
A&A
Volume 672, April 2023
Article Number A111
Number of page(s) 22
Section Numerical methods and codes
DOI https://doi.org/10.1051/0004-6361/202245172
Published online 10 April 2023

© The Authors 2023

Licence Creative Commons

Open Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

This article is published in open access under the Subscribe to Open model. Subscribe to A&A to support open access publication.

1 Introduction

The advent of modern astronomical surveys, initiated by the Sloan Digital Sky Survey (SDSS, Blanton et al. 2017) and further propelled by the Zwicky Transient Facility (ZTF, Bellm et al. 2019), has popularised the use of automated machine learning methods (Baron 2019). This shift towards a data-driven approach to astronomical research has been developing swiftly for supervised learning tasks in the areas of classification (see e.g. Carleo et al. 2019; Ishida 2019; Malik et al. 2022, and references therein) and regression (e.g. Krone-Martins et al. 2014; Pasquet et al. 2019; Cabayol et al. 2021; Henghes et al. 2021; Chen et al. 2022).

Nevertheless, thanks to the availability of continuous scans of the sky with instruments that are capable of achieving unprecedented resolution, it is natural to expect that new and interesting astrophysical sources will continue to be detected. The challenge then becomes developing automated unsupervised learning strategies that can successfully identify such sources among large and complex data sets. The astronomical community has devoted significant efforts to this direction. For example, Pruzhinskaya et al. (2019) applied the isolation forest (IF, Liu et al. 2008) algorithm to identify contaminants in the Open Supernova Catalog (Guillochon et al. 2017). Malanchev et al. (2021a) used four different anomaly detection (AD) algorithms and a comprehensive feature extraction process to identify unusual light curves in the third ZTF data release (DR). In searching for changing-state active galactic nuclei (AGNs), Sánchez-Sáez et al. (2021) identified 75 promising candidates by combining dimensionality reduction via deep learning with IF. Storey-Fisher et al. (2021) applied a Wasserstein generative adversarial network on nearly one million optical galaxy images in the Hyper Suprime-Cam survey. Martínez-Galarza et al. (2021) combined tree-based AD and manifold learning to identify sets of unusual light curves in Kepler data. Chan et al. (2022) applied a similar strategy to identify anomalous periodic variables in ZTF data. Sarkar et al. (2022) used the Earth as an anomaly example in order to estimate the habitability of exoplanets using a multi-stage memetic algorithm. Kovačević et al. (2022) used self-organising maps to analyse temporal-only parameters computed from ≳10⁵ sources from the Exploring the X-ray Transient and variable Sky catalogue. Aleo et al. (2022b) used simulated light curves to search for counterparts in ZTF DR4, identifying 11 non-catalogued transients.

Despite such promising results, all AD studies need to deal with the discrepancy between the statistical definition of an outlier (which directly affects the output from traditional machine learning models) and astrophysically interesting anomalies (unforeseen or yet to be confirmed events generated by unusual astrophysical phenomena). In large data sets, outliers tend to dominate the set of objects with high anomaly scores (Malanchev et al. 2021a). Adaptive learning techniques are aimed at sequentially incorporating expert knowledge in machine learning models (see e.g. Ishida et al. 2021; Lochner & Bassett 2021). The SNAD team has been consistently improving and testing such an adaptive learning strategy, whereby at each iteration, a binary reply from the expert is incorporated into the weight calculation of an IF model, producing updated anomaly scores. The active anomaly discovery (AAD, Das et al. 2017) algorithm has proven to be effective in its first application to real data (Ishida et al. 2021). In this work, we stress test the effectiveness of this strategy by applying it to light curves from ZTF DR3. Considering as anomalies any light curves that resemble those of supernovae (SNe), our experts scanned 70 ZTF fields searching for uncatalogued or anomalous transients.

This paper is organised as follows. Section 2 describes the data selection process (Sect. 2.1), learning algorithm (Sect. 2.2), and a summary of the results (Sect. 2.3). In Sect. 3, we present the results of our light-curve modelling for a subset of the newly reported transients. Section 4 presents an in-depth discussion on superluminous supernova (SLSN) candidates (Sect. 4.1), along with a complete set of labels within the SNAD viewer knowledge database (Sect. 4.2) and a description of other non-catalogued objects found during our search (Sect. 4.3). We present our conclusions in Sect. 5. Additionally, the complete SNAD catalogue of discovered transients is shown in Appendix A. Appendix B shows light curves and corresponding fit models for SNAD objects. Appendix C gives a glimpse of the domain knowledge database within the SNAD viewer.

2 Supernova search

2.1 ZTF data and field selection

We analysed photometric data from the first 9.4 months of the ZTF survey, between 2018 March 17 and December 31 (58194 ≤ MJD ≤ 58483). This period includes data from the ZTF private survey, thus offering a better cadence than the rest of DR3. However, the expert analysis of discovered SNe (see Sect. 3) used more complete light curves from ZTF DR8.
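The correspondence between the quoted MJD limits and the calendar dates can be checked with a few lines of Python. This is only a convenience sketch; the helper name `mjd_to_date` is ours and is not part of any ZTF or SNAD tool.

```python
from datetime import date, timedelta

# MJD 0 corresponds to 1858 November 17 (start of the day, UTC).
MJD_EPOCH = date(1858, 11, 17)

def mjd_to_date(mjd):
    """Return the calendar date on which the given integer MJD starts."""
    return MJD_EPOCH + timedelta(days=int(mjd))

start = mjd_to_date(58194)
end = mjd_to_date(58483)
span_days = 58483 - 58194  # length of the analysed period in days

print(start, end, span_days)
```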

Given the higher probability of finding SNe in low extinction regions, we analysed only those fields with centres at > 20° above the galactic plane. The distribution of the 70 fields considered in this work is given in Fig. 1.
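A latitude cut of this kind can be sketched as follows. We assume here that the criterion is applied to the absolute galactic latitude of the field centre, |b| > 20°, and we use the standard J2000 position of the north galactic pole; the function names are hypothetical and this is not the selection code used in this work.

```python
import math

# J2000 equatorial coordinates of the north galactic pole, in degrees.
RA_NGP, DEC_NGP = 192.85948, 27.12825

def galactic_latitude(ra_deg, dec_deg):
    """Galactic latitude b (degrees) from J2000 right ascension and declination."""
    ra, dec = math.radians(ra_deg), math.radians(dec_deg)
    ra0, dec0 = math.radians(RA_NGP), math.radians(DEC_NGP)
    sin_b = (math.sin(dec) * math.sin(dec0)
             + math.cos(dec) * math.cos(dec0) * math.cos(ra - ra0))
    # Clamp against rounding errors before taking the arcsine.
    return math.degrees(math.asin(max(-1.0, min(1.0, sin_b))))

def passes_latitude_cut(ra_deg, dec_deg, min_abs_b=20.0):
    """True if the position lies more than min_abs_b degrees from the plane."""
    return abs(galactic_latitude(ra_deg, dec_deg)) > min_abs_b

# Position of SNAD120 (Sect. 4.1), converted to degrees.
ra_snad120 = 15.0 * (17 + 0 / 60 + 16.296 / 3600)
dec_snad120 = 70 + 30 / 60 + 49.55 / 3600
print(galactic_latitude(ra_snad120, dec_snad120))
```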

Each of the selected fields contains from a few thousand to a little more than a million objects with at least 100 photometric points in the zr-band (catflags = 0), comprising ~26.5 million light curves in total. Each object is characterised by a ZTF Object ID (OID). This identifier is unique only within each field and each band; therefore, the same source observed in different fields and in different bands can have several OIDs.

For each OID, we extracted 42 zr-band light curve features including magnitude amplitude, Stetson K coefficient (Stetson 1996), standard deviation of Lomb–Scargle periodogram (Lomb 1976; Scargle 1982), and others. A full description of all features used is given in Malanchev et al. (2021a,b).
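To give a flavour of such features, the sketch below computes two of them in plain Python. The definitions are assumptions made for illustration: the amplitude is taken as half the peak-to-peak magnitude range, and the Stetson K coefficient is computed from residuals about the inverse-variance weighted mean. The exact definitions used in this work are those of Malanchev et al. (2021a,b).

```python
import math

def amplitude(mag):
    """Half of the peak-to-peak magnitude range (assumed definition)."""
    return 0.5 * (max(mag) - min(mag))

def stetson_k(mag, magerr):
    """Stetson K: mean absolute scaled residual over its root mean square.

    K is about 0.8 for Gaussian scatter and drops towards zero when a
    single point dominates the variance.
    """
    weights = [1.0 / e ** 2 for e in magerr]
    mean = sum(w * m for w, m in zip(weights, mag)) / sum(weights)
    resid = [(m - mean) / e for m, e in zip(mag, magerr)]
    n = len(resid)
    return sum(abs(r) for r in resid) / math.sqrt(n * sum(r * r for r in resid))

# Hypothetical light curve: constant source with one bright outlying epoch.
mag = [20.0] * 99 + [18.0]
magerr = [0.05] * 100
print(amplitude(mag), stetson_k(mag, magerr))
```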

2.2 Active anomaly discovery

Recommendation systems are automatic algorithms whose goal is to minimise the cost of labelling tasks and, at the same time, to optimise classification or anomaly detection results. In this work, we use the AAD algorithm proposed by Das et al. (2017). It starts with a traditional IF and sequentially presents the object with the highest anomaly score to the expert. If the expert judges a particular outlier not to be interesting, the weight of each decision path is changed to accommodate this new information and the data are passed through the slightly modified forest. The process is repeated until a certain budget has been reached. This framework was first applied to a simulated as well as a small real data set by Ishida et al. (2021).
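The query loop described above can be illustrated with a deliberately simplified sketch. It is not the AAD implementation of Das et al. (2017), which re-weights the isolation forest by optimising a dedicated loss function. Here we assume that each object carries a hypothetical vector of per-tree anomaly scores, and the tree weights are nudged with a perceptron-like rule after every expert answer. The names `active_loop`, `ask_expert`, and `lr` are ours.

```python
def active_loop(scores, ask_expert, budget=30, lr=0.5):
    """Toy expert-in-the-loop ranking (not the actual AAD update rule).

    scores[i][t] is an assumed anomaly score of object i in tree t;
    ask_expert(i) returns True if the expert finds object i interesting.
    Returns the expert labels and the final tree weights.
    """
    n_trees = len(scores[0])
    weights = [1.0 / n_trees] * n_trees
    labelled = {}

    def combined(i):
        return sum(w * s for w, s in zip(weights, scores[i]))

    for _ in range(min(budget, len(scores))):
        # Show the expert the highest-ranked object not yet inspected.
        candidates = [i for i in range(len(scores)) if i not in labelled]
        top = max(candidates, key=combined)
        answer = ask_expert(top)
        labelled[top] = answer
        # Reward trees that scored an interesting object highly,
        # penalise them otherwise; keep the weights non-negative.
        sign = 1.0 if answer else -1.0
        weights = [max(w + sign * lr * s, 0.0)
                   for w, s in zip(weights, scores[top])]
        norm = sum(weights) or 1.0
        weights = [w / norm for w in weights]
    return labelled, weights
```

In this toy version a negative answer lowers the weight of the trees that produced the rejected candidate, so that similar objects fall in the ranking at the next iteration. This mimics the qualitative behaviour of the adaptive strategy, not its mathematical formulation.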

Here, we present the first application of AAD to a significantly larger data set of real observations (~26.5 million light curves). Since the algorithm can adapt to the expert’s opinion, it can be used for a targeted search of transients of a certain type (e.g. SNe). Therefore, in this analysis, a human expert considered only SN-like candidates as anomalies; all other objects proposed by the algorithm were rejected by the expert as ‘uninteresting’ (these two choices correspond to ‘yes’ and ‘no’ in the AAD interface). For each field, the expert went through a total budget of 30 objects.

In order to enable a smooth interaction between our experts and the AAD algorithm when dealing with such a large data set, we developed the SNAD knowledge database (Malanchev et al. 2023), a framework used by our experts to log their input as one entry in a tailored set of labels (see further details in Sect. 4.2). For each one of the ZTF fields, our experts went through 30 objects, registering their feedback as a binary answer. The distribution of objects by type for each of the 35 fields containing SNe or SN candidates is given in Fig. 2. Each line represents one AAD run with 30 queries in the order of appearance to the expert. The colour denotes the assigned tag, namely, whether it is a supernova, artefact, or other type of object.

In what follows, we further investigate the most interesting objects we encountered. The source code is publicly available as part of the zwad GitHub repository (Malanchev et al. 2021b).

2.3 Results

We visually inspected 2100 (70 × 30) outliers. Among them, we found 104 SN-like objects, 57 of which were reported for the first time and 47 were previously mentioned in other catalogues, either as SNe of known types or as SN candidates (see Sect. 4.2 for other types of objects found). Sources which were not previously mentioned in the Transient Name Server (TNS) received an internal SNAD name, were added to the SNAD catalogue, and reported to TNS. The full list of transients found by the AAD algorithm is given in Table A.1. Column 1 contains the internal SNAD names of the non-catalogued SN-like candidates. The equatorial coordinates are given in Cols. 2 and 3. In Col. 4, we list the ZTF OIDs. The suggested transient type is defined in Col. 5. If the object also exists in ZTF alerts, the corresponding target alert name is given in Col. 6. Column 7 contains the TNS name. Column 8 reports OIDs output by the pipeline that correspond to the same astrophysical source.

Figure 1 shows the distribution of inspected fields on the sky in equatorial coordinates, along with the corresponding number of objects. The 35 fields with detected SN candidates are outlined in black. Naively, we would expect fields with supernovae to be concentrated in the regions furthest from the galactic plane and galactic centre. However, we observe that they lie at intermediate galactic longitudes and latitudes. This can be explained by the smaller number of observations in the regions furthest from the galactic plane. Moreover, the number of objects per field varies from a few thousand to more than a million. We did not detect any SN in the three fields with more than a million objects, which are also very close to the Milky Way centre. This may indicate that a budget of 30 objects was not enough for the AAD to adapt, and that the budget should ideally be scaled according to the number of objects in the field.

Among the previously reported supernova candidates, there are 14 SNe Ia, 13 possible SNe, 7 SNe II, 3 SNe Ic, 2 SNe IIP, and 1 SN Ib; the remaining 7 catalogued SNe belong to the rare supernova classes considered as anomalies in Pruzhinskaya et al. (2019) and Ishida et al. (2021), namely, 2 SNe IIb, 1 SN Ia Pec, 1 SN Ia-91bg, 1 SN Ic BL, 1 SN IIn, and 1 SLSN-I. To compare the efficiency of the AAD algorithm in searching for rarer and therefore potentially more interesting objects, we recorded the number of spectroscopically confirmed SNe found in this work and discovered by different groups, in the ZTF data, according to TNS for the same period of time (58194 ≤ MJD ≤ 58483), as shown in Table 1. The fraction of rare SN types among the total number is ~21% for AAD discoveries and ~10% for general TNS findings.
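The AAD fraction quoted above can be reproduced from the counts listed in this paragraph. The short check below assumes that the denominator is the number of spectroscopically confirmed SNe, that is, the 47 previously reported objects minus the 13 possible SNe; this reading is ours, and the TNS fraction cannot be recomputed here because it relies on the counts in Table 1.

```python
# Counts of previously reported objects, as listed in the text.
previously_reported = {
    "Ia": 14, "possible SN": 13, "II": 7, "Ic": 3, "IIP": 2, "Ib": 1,
    "IIb": 2, "Ia Pec": 1, "Ia-91bg": 1, "Ic BL": 1, "IIn": 1, "SLSN-I": 1,
}
rare_types = ["IIb", "Ia Pec", "Ia-91bg", "Ic BL", "IIn", "SLSN-I"]

total = sum(previously_reported.values())
# Assumption: only spectroscopically confirmed SNe enter the fraction.
confirmed = total - previously_reported["possible SN"]
n_rare = sum(previously_reported[t] for t in rare_types)
rare_fraction = n_rare / confirmed

print(total, confirmed, n_rare, round(100 * rare_fraction))
```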

Non-catalogued SN-like objects are listed at the beginning of Table A.1. We note that 15 SNAD possible supernovae (PSNe) are missing in the official ZTF alert stream (Table A.1, Col. 6). Missed transients have peak zr magnitude ~ 19.5–20 mag, which is indeed quite faint, but still compatible with those of some other SNAD transients detected by the alert system. Furthermore, some of our candidates (e.g. SNAD128, SNAD165) have well-sampled early light curves, which is of interest for surveys such as the Young Supernova Experiment (Jones et al. 2021).

Fig. 1

Sky map in equatorial coordinates with plotted positions of ZTF fields analysed in this work; the colour bar shows the number of objects in each field. Fields with detected supernova candidates are highlighted with bold black boundaries. The blue curve denotes the galactic plane. The black triangle marks the galactic centre and the black circle corresponds to the position of the Andromeda galaxy.

Fig. 2

Distribution of objects by type for each of the 35 fields containing supernovae or supernova candidates. Each line represents one AAD run with 30 queries. Red, green, and beige colours denote the supernova candidate, artefact or other type of object, respectively. Fields are ordered by the number of objects in them, from 135 681 (bottom line) to 856 453 (top line).

3 Supernova modelling

We used the PYTHON library SNCOSMO to obtain a preliminary photometric classification for SNAD objects. Their light curves were fitted with Peter Nugent’s supernova models, which cover the main SN types (Ia, Ib/c, IIP, IIL, IIn). Nugent’s models are simple spectral time series that can be scaled up and down. The model parameters are the redshift, z, the observer-frame time corresponding to the source’s zero phase, t0, and the amplitude. The zero phase is defined relative to the explosion moment and the observed time, t, is related to phase via t = t0 + phase × (1 + z).
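The relation between rest-frame phase and observed time is simple enough to be written out explicitly. The two helper functions below are a sketch of that relation and its inverse; their names are ours and they are not part of SNCOSMO.

```python
def observed_time(phase, t0, z):
    """Observer-frame time for a rest-frame phase: t = t0 + phase * (1 + z)."""
    return t0 + phase * (1.0 + z)

def rest_frame_phase(t, t0, z):
    """Inverse relation: phase = (t - t0) / (1 + z)."""
    return (t - t0) / (1.0 + z)

# Hypothetical example: 20 rest-frame days after zero phase at z = 0.2.
t = observed_time(20.0, t0=58300.0, z=0.2)
print(t, rest_frame_phase(t, t0=58300.0, z=0.2))
```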

In order to perform a preliminary fit, we used only the zr-band from DR8. We subtracted the reference magnitude from ZTF light curves, thus roughly accounting for the host galaxy contamination. The reference magnitude was retrieved from ZTF archival data and is listed in the SNAD catalogue. We also corrected for line-of-sight reddening in the Milky Way using Schlafly & Finkbeiner (2011) estimates. For sources with an SDSS DR16 (Ahumada et al. 2020) photometric redshift of a host galaxy at the source position, we fixed the redshift to this value. If this was not available, we adopted [−15; −22] as an acceptable range for the supernova absolute magnitude (Richardson et al. 2014) and then, using the maximum apparent magnitude, roughly transformed it to the corresponding redshift range. We applied a χ² criterion to choose the best-fit model for each SNAD object. Results of the light curve fit are given in Appendix B; the best-fit model for each SNAD transient is listed in Col. 5 of Table A.1.
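The rough conversion from an absolute-magnitude range to a redshift range can be sketched as follows. This is an illustration under our own assumptions, not the procedure used for the fits: we adopt a flat ΛCDM cosmology with H0 = 70 km s⁻¹ Mpc⁻¹ and Ωm = 0.3, ignore K-corrections and host extinction, and invert the distance modulus by bisection. The peak magnitude of 19.5 mag in the example is a placeholder.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s
H0 = 70.0            # assumed Hubble constant in km/s/Mpc
OMEGA_M = 0.3        # assumed matter density (flat LCDM)

def luminosity_distance_mpc(z, steps=1000):
    """Luminosity distance in Mpc via trapezoidal integration of 1/E(z)."""
    def inv_e(x):
        return 1.0 / math.sqrt(OMEGA_M * (1.0 + x) ** 3 + 1.0 - OMEGA_M)
    h = z / steps
    integral = 0.5 * (inv_e(0.0) + inv_e(z))
    integral += sum(inv_e(i * h) for i in range(1, steps))
    return (1.0 + z) * C_KM_S / H0 * integral * h

def distance_modulus(z):
    """m - M = 5 log10(d_L / Mpc) + 25."""
    return 5.0 * math.log10(luminosity_distance_mpc(z)) + 25.0

def redshift_from_modulus(mu, z_lo=1e-4, z_hi=3.0):
    """Invert the (monotonic) distance modulus by bisection."""
    for _ in range(60):
        z_mid = 0.5 * (z_lo + z_hi)
        if distance_modulus(z_mid) < mu:
            z_lo = z_mid
        else:
            z_hi = z_mid
    return 0.5 * (z_lo + z_hi)

def redshift_range(m_peak, abs_faint=-15.0, abs_bright=-22.0):
    """Redshift interval compatible with the adopted absolute-magnitude range."""
    return (redshift_from_modulus(m_peak - abs_faint),
            redshift_from_modulus(m_peak - abs_bright))

z_min, z_max = redshift_range(19.5)  # placeholder peak magnitude
print(z_min, z_max)
```

The resulting interval could then serve as bounds on z in the light-curve fit whenever no host redshift is available.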

It should be noted that we did not intend to make a detailed fit, but, rather, to show that the candidate light curves, selected initially by eye, can be satisfactorily fitted by different supernova models. That is why only one band (zr) has been used in the fit. Also, we did not take into account the possible extinction in host galaxies of the candidates; therefore, our fit is less accurate for highly reddened objects. Moreover, the redshift we assigned to some host galaxies is photometric, which is another source of uncertainty. Finally, the model itself is rather simple and limited in wavelength and time range. As a result of these conscious simplifications and assumptions, the obtained absolute magnitude for some of the objects is not typical for normal SNe (e.g. SNAD122, Mr(IIP) ≃ −22.6 mag) and we cannot trust the classification in those cases. However, this simple fit is enough to show that a few transients have anomalously wide light curves when compared to normal SNe, making them candidates for the SLSN class (Sect. 4.1).

Although this classification should be treated with caution, it follows closely the behaviour of light curves with a sufficient number of observations before and after maximum light. Using the SNCOSMO library, we also performed a multi-band light-curve fit for a few objects with the models suggested by the preliminary classification. The parameters of the fit are z, t0, and the amplitude. The fits of SNAD112, SNAD142, SNAD165, and SNAD137 with Nugent’s Type Ia, IIP, Ibc, and IIn models are shown in Figs. 3–6, respectively. The quality of the fit allows us to conclude that those supernovae belong to the suggested types.

Table 1

Sub-populations of spectroscopically confirmed supernovae, found in this work (AAD) and total reported in TNS (TNS) for the same time period.

Fig. 3

Light curve fit of SNAD112 by Nugent’s Type Ia supernova model. Observational data correspond to OIDs: 79610140003999 (zg), 796201400007564 (zr), 796301400021075 (zi), and 797304300009092 (zi).

Fig. 4

Light curve fit of SNAD142 by Nugent’s Type IIP supernova model. Observational data correspond to OIDs: 826102200028756 (zg), 826202200030732 (zr), and 826302200021568 (zi).

Fig. 5

Light curve fit of SNAD165 by Nugent’s Type Ibc supernova model. Observational data correspond to OIDs: 763104300002058 (zg), 763204300004087 (zr), and 763304300014301 (zi).

Fig. 6

Light curve fit of SNAD137 by Nugent’s Type IIn supernova model. Observational data correspond to OIDs: 825102200009050 (zg), 825202200039582 (zr), and 825302200018371 (zi).

4 Discussion

4.1 Superluminous supernovae candidates

Four supernova candidates from our list possess significantly broader light curves in comparison with Nugent’s models and other candidates: SNAD120, SNAD121, SNAD160, and SNAD187 (see Appendix B). In this section we explore the possibility of these objects belonging to the SLSN class.

SNAD120 (AT2018lxa) is located at α = 17h00m16.296s, δ = +70°30′49.55″. In the official ZTF alert stream, it is denoted as ZTF18aazydub. According to Strotjohann et al. (2021), the transient has a spectroscopic redshift of zsp = 0.202 and was classified as SN IIn. Assuming this redshift, the estimated absolute magnitude at maximum brightness is Mr ≃ −20.5 mag, which is slightly dimmer than the threshold of −21 mag established for SLSNe (Gal-Yam 2012).

SNAD121 (AT2018lxb, ZTF18abklshn) is located at α = 16h33m19.937s, δ = +71°06′54.50″. On archival images provided by the Legacy Surveys Sky Viewer, a possible host is detected with an estimated photometric redshift of zph = 0.240 ± 0.166 (Zhou et al. 2021). Taking into account the redshift uncertainty, the absolute magnitude of this source is estimated to be brighter than −21 mag; thus, it is compatible with SLSNe.

SNAD160 (AT2018lzi, ZTF18aautopz) is located at α = 13h43m53.357s, δ = +61°33′17.24″. The ALeRCE ZTF Explorer automatically classified ZTF18aautopz as a SLSN. The spectroscopic redshift is zsp = 0.295 (Strotjohann et al. 2021), which gives Mr ≃ −21.6 mag at maximum light. SNAD160 was reported by the SNAD team in Pruzhinskaya et al. (2022) as a possible pair-instability supernova – a theoretical class of thermonuclear explosions which takes place at the end of life of very massive stars with highly increased production of ⁵⁶Ni (e.g. Gal-Yam 2019; Kozyreva et al. 2014).

SNAD187 (AT2018mcb, ZTF18aaqctvg) is located at α = 13h53m7.366s, δ = +40°48′7.42″. There are several photometric redshift estimations of its possible host provided by different surveys: zph = 0.204 ± 0.084 by the Legacy Surveys Sky Viewer (Zhou et al. 2021), zph = 0.343 ± 0.128 by SDSS DR16 (Ahumada et al. 2020), and zph ≃ 0.201 by Gaia DR3 (Gaia Collaboration 2022). Also, according to Gaia variability classification results there is an AGN at the transient position (Gaia Collaboration 2022). It is possible that SNAD187 is not associated with the host AGN activity and could be a SLSN. Recently, the ANTARES broker AD filter reported the discovery of a SLSN – SN 2022mnj at the central region of an AGN (Aleo et al. 2022a; Ashall 2022; see also Moriya et al. 2017).

Figure 7 shows the observed light curves of SNAD120, SNAD121, SNAD160, and SNAD187 in the zr-band in comparison with SN 2006gy (Smith et al. 2007) – one of the brightest among the well-studied SLSNe, shifted to z = 0.3 and 0.4. SN 2006gy has a very broad light curve, but it is clear that the SNAD candidates have even broader light curves, making them peculiar objects among known SNe. The discovery of four slowly evolving transients among the SNAD objects, not reported by previous searches, provides clear evidence that the AAD is efficient in searching for rare classes of astronomical objects within large and complex data sets.

4.2 SNAD knowledge database

Beyond the transient candidates discussed previously, this work also produced a valuable knowledge database incorporated within the SNAD viewer (Malanchev et al. 2023). The viewer is a specially designed web-interface, which allows the expert to visualise ZTF DR light curves, provides access to the individual exposure images, and performs cross-matches with different databases and catalogues. For authorised users, there is a possibility to assign the labels (tags) to ZTF objects (see Fig. C.1).

We defined a system of tags that includes some general classes: variable star of unspecified type (VAR), transient (TRANSIENT), active galactic nucleus (AGN), quasar (QSO), normal star without strong variability (STAR), and galaxy (GALAXY), as well as the most popular types and subtypes of variable stars and transients, such as:

  • Supernova (SN): Type Ia supernova (SNIA), core-collapse supernova (CCSN), and super-luminous supernova (SLSN);

  • Eclipsing variable (ECLIPSING): β Persei-type (Algol) eclipsing system (EA), β Lyrae-type eclipsing system (EB), and W Ursae Majoris-type eclipsing variable (EW);

  • Pulsating variable (PULSATING): cepheid (CEP), classical cepheid or δ Cephei-type variable (DCEP), slow irregular variable (L), long period variable (LPV), o Ceti-type (Mira) variable (M), variable of the RR Lyrae type (RR), RR Lyrae variable with asymmetric light curve (RRAB), red super-giant (RSG), semi-regular variable (SR), and variable of the δ Scuti type (DSCT);

  • Cataclysmic variable (CATACLYSMIC): AM Herculis-type variable (AM), nova (N), U Geminorum-type variable or dwarf nova (UG), SS Cygni-type variable (UGSS), and Z Camelopardalis-type star (UGZ);

  • Eruptive variable (ERUPTIVE): Orion variable with rapid light variations (INS), variable of the S Doradus type (SDOR), T Tauri star (TTS), young stellar object of unspecified variable type (YSO), M dwarf flare (M_DWARF_FLARE);

  • Rotating variable (ROTATING): BY Draconis-type variable (BY), RS Canum Venaticorum-type binary system (RSCVN).

There are a few custom tags for internal purposes, such as transients with one outlier point (1-POINT) or candidates to be sent to TNS (TNS_CANDIDATE). Also, tags of non-astrophysical origin such as artefacts and their subtypes are present. Several tags can be assigned to one object; the history of tag changes is also stored in the database (Fig. C.1, on the right).

The choice of tags is determined by the experts, based on the most frequent types of objects appearing in the output of the AD algorithms and also determined by the project needs. Therefore, we do not claim to be complete in covering all possible types of variables and transients.

During the supernova search, a total of 1482 objects were labelled. Despite the fact that the ZTF data processing pipeline includes a procedure to separate astrophysical events from bogus ones, namely, false positive detections (Masci et al. 2019), ~45% of the objects inspected in fields with SNe are artefacts. Examples of the artefacts found are given in Fig. 8.

Among real variable sources, the most common types in fields containing SNe are eclipsing (N = 51, ~5%) and pulsating (N = 53, ~5%) variables, as well as AGNs (N = 176, ~17%). The assigned labels can be used to further improve the ZTF pre-processing pipeline (in the case of artefacts) as well as for machine-learning classification tasks (in the case of astrophysical labels).

Fig. 7

Light curves of SNAD SLSN candidates in zr-band in comparison with the R-band light curve of well-studied SLSN SN 2006gy shifted to z = 0.3 (black pluses) and z = 0.4 (black crosses). The observed magnitudes of SN 2006gy are taken from Smith et al. (2007). All the light curves are shown relative to the maximum light.

4.3 Other non-catalogued objects

During the supernova search, a number of interesting non-catalogued objects of other types have been found. Among those there are red dwarf flares, namely, transients caused by the sudden release of stored magnetic energy from surface magnetic loops into the outer stellar atmosphere (Pettersen 1989; Haisch et al. 1991), and AGNs. For example, a two-peak flare of a red dwarf, OID = 726209400028833, located at a distance of ~162 pc (Bailer-Jones et al. 2018) is shown in Fig. 9. The amplitude of the flare is ~1.8 mag, the minimum duration is ~46 min. There are many unsolved questions related to flare physics, red dwarf distribution in the Galaxy, and habitability of host planets, which can benefit from a systematic study of a large sample of such events (e.g. Segura et al. 2010; Engle & Guinan 2011; France et al. 2013; Webb et al. 2021). Moreover, good observational cadence of the flare (~70 points in 46 min) also opens up a possibility to search for fast transients in ZTF data.
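The numbers quoted for this flare translate into a sampling interval and a flux increase as follows. The input values are the approximate ones given in the text, so the outputs are indicative only.

```python
# Approximate values quoted in the text for the flare of OID 726209400028833.
n_points = 70          # photometric points during the flare
duration_min = 46.0    # minimum flare duration in minutes
amplitude_mag = 1.8    # flare amplitude in magnitudes

# Mean interval between consecutive points, in seconds.
mean_cadence_s = duration_min * 60.0 / (n_points - 1)

# Peak-to-quiescence flux ratio implied by the magnitude amplitude.
flux_ratio = 10 ** (0.4 * amplitude_mag)

print(mean_cadence_s, flux_ratio)
```

A sampling interval of about 40 s and a flux increase by a factor of roughly five give a sense of the time resolution available for fast-transient searches in such high-cadence ZTF sequences.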

Another interesting object, OID = 676213300006792, located at a distance of ~234 pc (Bailer-Jones et al. 2018), shows two outbursts, one of which is observed at a high frequency (see Fig. 10). Based on its SDSS spectrum, 676213300006792 was previously identified as a white dwarf-main sequence binary with a secondary M-dwarf companion (Liu et al. 2012). 676213300006792 is a weak UV source and does not appear in any X-ray database. Its SDSS spectrum does not show a significant Hα emission. The high cadence zg-band light curve shows a periodicity with P ≃ 4.25 min, just before the flare. We assume that there is no stable mass transfer in the system, and the M-dwarf has not overflowed its Roche lobe. We attribute the outbursts to the low accretion rate of the unstable stellar wind on the white dwarf during the increase in magnetic activity from the M-dwarf. The periodic variation before the flare may be related to a hot spot in the temporary accretion disc.
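A periodicity of this kind can be searched for with a Lomb–Scargle periodogram. The sketch below implements the classical periodogram in plain Python and applies it to a synthetic, noise-free sinusoid with a 4.25 min period sampled unevenly over about 45 min. The data are simulated for illustration and are not the ZTF measurements; the period search actually performed for this object may have used a different method.

```python
import math

def lomb_scargle_power(t, y, period):
    """Classical (unnormalised) Lomb-Scargle power at one trial period."""
    omega = 2.0 * math.pi / period
    mean = sum(y) / len(y)
    dy = [v - mean for v in y]
    # Time offset that makes the sine and cosine terms orthogonal.
    tau = math.atan2(sum(math.sin(2.0 * omega * ti) for ti in t),
                     sum(math.cos(2.0 * omega * ti) for ti in t)) / (2.0 * omega)
    c = [math.cos(omega * (ti - tau)) for ti in t]
    s = [math.sin(omega * (ti - tau)) for ti in t]
    yc = sum(d * ci for d, ci in zip(dy, c))
    ys = sum(d * si for d, si in zip(dy, s))
    return 0.5 * (yc ** 2 / sum(ci * ci for ci in c)
                  + ys ** 2 / sum(si * si for si in s))

def best_period(t, y, periods):
    """Trial period with the highest periodogram power."""
    return max(periods, key=lambda p: lomb_scargle_power(t, y, p))

# Synthetic test signal (minutes, magnitudes): uneven sampling, P = 4.25 min.
times = [0.65 * i + 0.1 * math.sin(i) for i in range(70)]
mags = [20.0 + 0.05 * math.sin(2.0 * math.pi * ti / 4.25) for ti in times]
trial_periods = [2.0 + 0.05 * k for k in range(161)]  # 2 to 10 min

recovered = best_period(times, mags, trial_periods)
print(recovered)
```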

Other non-catalogued objects include candidates for AGNs (e.g. Fig. 11) and variable stars of different nature (e.g. eclipsing binary candidate in Fig. 12). All these objects can be studied separately in the future by the domain experts.

5 Conclusions

In this work, we provide the first results from the complete SNAD adaptive learning pipeline in the presence of big data from large-scale astronomical surveys. The SNAD team became aware of the existence of non-reported supernova candidates within the ZTF DRs once they appeared in a non-targeted anomaly detection search (Malanchev et al. 2021a). A new experiment was then designed to develop a tailored machine learning model which would explore this possibility by taking advantage of the SNAD adaptive learning pipeline (Ishida et al. 2021) and our experts’ long-term experience studying supernovae.

We selected 70 ZTF fields at high galactic latitude and applied a series of quality cuts followed by a dedicated feature extraction (Sect. 2.1). The resulting homogeneous feature sets (one per field) were independently submitted to 30 iterations of the active anomaly discovery algorithm, where at each iteration, the domain expert gave positive feedback to any outlier whose light curve resembled a SN and negative feedback otherwise. During this process, human-assigned labels were added to the SNAD knowledge database, opening the way for future deeper analysis of the same data (Sect. 4.2). From the 2100 objects visually inspected, we found 104 SN-like events, 57 of which were reported for the first time. These transients received an internal name, were reported to TNS, and were added to the SNAD catalogue (see Sect. 2.3).

In order to evaluate probable classification types for the newly found transients, we performed light curve fits using different supernova models (Sect. 3). Among the newly found transients, we reported three objects (SNAD121, SNAD160 and SNAD187) with broad, slowly evolving light curves that stand as promising superluminous supernova candidates (see Fig. 7 and Pruzhinskaya et al. 2022).

Despite the fact that the AAD was aimed at supernova search, other potentially interesting objects have been found, including non-catalogued AGNs and red dwarf flares. The high cadence data of discovered flares opens the possibility of searching for fast transients in ZTF. Moreover, the visual inspection of AAD outliers during the SN search led to the creation of the SNAD knowledge database that can be used for different machine learning tasks in the future.

The overall efficiency of the pipeline is highly dependent on the total number of objects being analysed, feature choices, and maximum iterations budget, among other parameters. Nevertheless, the results presented here confirm the effectiveness of adaptive learning approaches in filtering large astronomical data sets for expert analysis. They reveal important characteristics of ZTF data releases that ought to be further scrutinised to avoid similar losses in the future (Aleo et al. 2022b).

Fig. 8

Examples of artefacts found during the supernova search with AAD. The outlier is located in the image centre. The image sizes are 600×600, 100×100, and 200×200 CCD pixels, respectively.

Fig. 9

Light curve of a complex red dwarf flare, OID: 726209400028833 (zr). The inset shows a zoomed high-cadence light curve with the two-peak flare.

Fig. 10

Light curves of a white dwarf-M-dwarf binary system. Observational data correspond to OIDs: 676113300003418 (zg), 676213300006792 (zr), and 676313300009030 (zi). Inset plot shows a zoomed high-cadence zg-band light curve with flare and possible periodicity.

Fig. 11

Light curves of an AGN candidate. Observational data correspond to OIDs: 763114100009685 (zg), 763214100020128 (zr), and 763314100028589 (zi).

Fig. 12

Folded light curves of a non-catalogued eclipsing binary candidate. Observational data correspond to OIDs: 791113400002334 and 792116300002277 (zg); 791213400009257 and 792216300003876 (zr); 791313400005610 and 792316300013740 (zi).

Acknowledgements

We thank Anastasia Voloshina and Alexandra Zubareva for the assistance in variable star classification and analysis. We also thank Stephane Blodin and Alexandra Kozyreva for discussion involving PISN modelling. The reported study was funded by RFBR and CNRS according to the research project No. 21-52-15024. We used the equipment funded by the Lomonosov Moscow State University Program of Development. The authors acknowledge the support by the Interdisciplinary Scientific and Educational School of Moscow University “Fundamental and Applied Space Research”. P.D.A. is supported by the Center for Astrophysical Surveys (CAPS) at the National Center for Supercomputing Applications (NCSA) as an Illinois Survey Science Graduate Fellow. V.V.K. is supported by the Ministry of science and higher education of Russian Federation, topic no. FEUZ-2020-0038. E.E.O.I. received financial support from CNRS International Emerging Actions under the project Real-time analysis of astronomical data for the Legacy Survey of Space and Time during 2021-2022.

Appendix A AAD results

We report the complete set of SN-like transients shown to the expert by the AAD pipeline below.

Table A.1

Complete list of supernovae and supernova candidates found by active anomaly discovery algorithm in ZTF DR3.

Appendix B Light curves of the SNAD supernova candidates

We present below a subset of SNAD candidates and their respective light curve fits (Sect. 3).

Figs. B.1–B.10

Light curves of SNAD supernova candidates in zr-band and the results of their fit by Nugent’s supernova models.

Appendix C SNAD ZTF viewer

We present below a glimpse of an expert’s session with the SNAD viewer¹⁶ (Malanchev 2021b; Malanchev et al. 2023).

Fig. C.1

SNAD ZTF viewer tags and logs, illustrated with the example of SNAD178.

References

  1. Ahumada, R., Allende Prieto, C., Almeida, A., et al. 2020, ApJS, 249, 3
  2. Aleo, P., Lee, C., Malanchev, K., et al. 2022a, Transient Name Server Discovery Report, 2022-1633, 1
  3. Aleo, P. D., Malanchev, K. L., Pruzhinskaya, M. V., et al. 2022b, New A, 96, 101846
  4. Ashall, C. 2022, Transient Name Server Classification Report, 2022-1690, 1
  5. Bailer-Jones, C. A. L., Rybizki, J., Fouesneau, M., Mantelet, G., & Andrae, R. 2018, AJ, 156, 58
  6. Baron, D. 2019, arXiv e-prints [arXiv:1904.07248]
  7. Bellm, E. C., Kulkarni, S. R., Graham, M. J., et al. 2019, PASP, 131, 018002
  8. Blanton, M. R., Bershady, M. A., Abolfathi, B., et al. 2017, AJ, 154, 28
  9. Cabayol, L., Eriksen, M., Amara, A., et al. 2021, MNRAS, 506, 4048
  10. Carleo, G., Cirac, I., Cranmer, K., et al. 2019, Rev. Mod. Phys., 91, 045002
  11. Chan, H.-S., Villar, V. A., Cheung, S.-H., et al. 2022, ApJ, 932, 118
  12. Chen, S.-X., Sun, W.-M., & He, Y. 2022, Res. Astron. Astrophys., 22, 025017
  13. Das, S., Wong, W.-K., Fern, A., Dietterich, T. G., & Amran Siddiqui, M. 2017, arXiv e-prints [arXiv:1708.09441]
  14. Engle, S. G., & Guinan, E. F. 2011, ASP Conf. Ser., 451, 285
  15. France, K., Froning, C. S., Linsky, J. L., et al. 2013, ApJ, 763, 149
  16. Gaia Collaboration 2022, VizieR Online Data Catalog: I/356
  17. Gal-Yam, A. 2012, Science, 337, 927
  18. Gal-Yam, A. 2019, ARA&A, 57, 305
  19. Guillochon, J., Parrent, J., Kelley, L. Z., & Margutti, R. 2017, ApJ, 835, 64
  20. Haisch, B., Strong, K. T., & Rodono, M. 1991, ARA&A, 29, 275
  21. Henghes, B., Pettitt, C., Thiyagalingam, J., Hey, T., & Lahav, O. 2021, MNRAS, 505, 4847
  22. Ishida, E. E. O. 2019, Nat. Astron., 3, 680
  23. Ishida, E. E. O., Kornilov, M. V., Malanchev, K. L., et al. 2021, A&A, 650, A195
  24. Jones, D. O., Foley, R. J., Narayan, G., et al. 2021, ApJ, 908, 143
  25. Kovačević, M., Pasquato, M., Marelli, M., et al. 2022, A&A, 659, A66
  26. Kozyreva, A., Blinnikov, S., Langer, N., & Yoon, S. C. 2014, A&A, 565, A70
  27. Krone-Martins, A., Ishida, E. E. O., & de Souza, R. S. 2014, MNRAS, 443, L34
  28. Liu, F. T., Ting, K. M., & Zhou, Z.-H. 2008, in 2008 Eighth IEEE International Conference on Data Mining, 413
  29. Liu, C., Li, L., Zhang, F., et al. 2012, MNRAS, 424, 1841
  30. Lochner, M., & Bassett, B. A. 2021, Astron. Comput., 36, 100481
  31. Lomb, N. R. 1976, Ap&SS, 39, 447
  32. Malanchev, K. 2021a, Astrophysics Source Code Library [record ascl:2107.001]
  33. Malanchev, K. L. 2021b, Astrophysics Source Code Library [record ascl:2106.034]
  34. Malanchev, K. L., Pruzhinskaya, M. V., Korolev, V. S., et al. 2021a, MNRAS, 502, 5147
  35. Malanchev, K. L., Pruzhinskaya, M. V., Korolev, V. S., et al. 2021b, Astrophysics Source Code Library [record ascl:2106.033]
  36. Malanchev, K., Kornilov, M. V., Pruzhinskaya, M. V., et al. 2023, PASP, 135, 024503
  37. Malik, A., Moster, B. P., & Obermeier, C. 2022, MNRAS, 513, 5505
  38. Martínez-Galarza, J. R., Bianco, F. B., Crake, D., et al. 2021, MNRAS, 508, 5734
  39. Masci, F. J., Laher, R. R., Rusholme, B., et al. 2019, PASP, 131, 018003
  40. Moriya, T. J., Tanaka, M., Morokuma, T., & Ohsuga, K. 2017, ApJ, 843, L19
  41. Pasquet, J., Bertin, E., Treyer, M., Arnouts, S., & Fouchez, D. 2019, A&A, 621, A26
  42. Pettersen, B. R. 1989, Sol. Phys., 121, 299
  43. Pruzhinskaya, M. V., Malanchev, K. L., Kornilov, M. V., et al. 2019, MNRAS, 489, 3591
  44. Pruzhinskaya, M., Volnova, A., Kornilov, M., et al. 2022, RNAAS, 6, 122
  45. Richardson, D., Jenkins, R. L., III, Wright, J., et al. 2014, AJ, 147, 118
  46. Sánchez-Sáez, P., Lira, H., Martí, L., et al. 2021, AJ, 162, 206
  47. Sarkar, J., Bhatia, K., Saha, S., Safonova, M., & Sarkar, S. 2022, MNRAS, 510, 6022
  48. Scargle, J. D. 1982, ApJ, 263, 835
  49. Schlafly, E. F., & Finkbeiner, D. P. 2011, ApJ, 737, 103
  50. Segura, A., Walkowicz, L. M., Meadows, V., Kasting, J., & Hawley, S. 2010, Astrobiology, 10, 751
  51. Smith, N., Li, W., Foley, R. J., et al. 2007, ApJ, 666, 1116
  52. Stetson, P. B. 1996, PASP, 108, 851
  53. Storey-Fisher, K., Huertas-Company, M., Ramachandra, N., et al. 2021, MNRAS, 508, 2946
  54. Strotjohann, N. L., Ofek, E. O., Gal-Yam, A., et al. 2021, ApJ, 907, 99
  55. Webb, S., Flynn, C., Cooke, J., et al. 2021, MNRAS, 506, 2089
  56. Zhou, R., Newman, J. A., Mao, Y.-Y., et al. 2021, MNRAS, 501, 3309

1

Nomenclature from Malanchev et al. (2021a).

13

Variable star types follow the convention used by the International Variable Star Index, https://www.aavso.org/vsx/index.php?view=about.vartypes

14

The SNAD catalogue of selected artefacts found in ZTF data is available at https://snad.space/art/

15

Results from the Active Anomaly Discovery algorithm and light-curve feature set are available in Zenodo, at https://zenodo.org/record/6998913.

All Tables

Table 1

Sub-populations of spectroscopically confirmed supernovae, found in this work (AAD) and total reported in TNS (TNS) for the same time period.

All Figures

Fig. 1

Sky map in equatorial coordinates showing the positions of the ZTF fields analysed in this work; the colour bar gives the number of objects in each field. Fields with detected supernova candidates are highlighted with bold black boundaries. The blue curve denotes the galactic plane. The black triangle marks the galactic centre and the black circle corresponds to the position of the Andromeda galaxy.

Fig. 2

Distribution of objects by type for each of the 35 fields containing a supernova or supernova candidate. Each line represents one AAD run with 30 queries. Red, green, and beige denote supernova candidates, artefacts, and other types of objects, respectively. Fields are ordered by the number of objects in them, from 135 681 (bottom line) to 856 453 (top line).

Fig. 3

Light curve fit of SNAD112 by Nugent’s Type Ia supernova model. Observational data correspond to OIDs: 79610140003999 (zg), 796201400007564 (zr), 796301400021075 (zi), and 797304300009092 (zi).

Fig. 4

Light curve fit of SNAD142 by Nugent’s Type IIP supernova model. Observational data correspond to OIDs: 826102200028756 (zg), 826202200030732 (zr), and 826302200021568 (zi).

Fig. 5

Light curve fit of SNAD165 by Nugent’s Type Ibc supernova model. Observational data correspond to OIDs: 763104300002058 (zg), 763204300004087 (zr), and 763304300014301 (zi).

Fig. 6

Light curve fit of SNAD137 by Nugent’s Type IIn supernova model. Observational data correspond to OIDs: 825102200009050 (zg), 825202200039582 (zr), and 825302200018371 (zi).

Fig. 7

Light curves of SNAD SLSN candidates in zr-band in comparison with the R-band light curve of the well-studied SLSN SN 2006gy shifted to z = 0.3 (black pluses) and z = 0.4 (black crosses). The observed magnitudes of SN 2006gy are taken from Smith et al. (2007). All the light curves are shown relative to the maximum light.

Fig. 8

Examples of artefacts found during the supernova search with AAD. The outlier is located in the image centre. The image sizes are 600×600, 100×100, and 200×200 CCD pixels, respectively.

Fig. 9

Light curve of a complex red dwarf flare, OID: 72620940028833 (zr). The inset shows a zoomed high-cadence light curve with a two-peaked flare.

Fig. 10

Light curves of a white dwarf-M-dwarf binary system. Observational data correspond to OIDs: 676113300003418 (zg), 676213300006792 (zr), and 676313300009030 (zi). The inset shows a zoomed high-cadence zg-band light curve with a flare and possible periodicity.