Improving the Automated Coronal Jet Identification with U-NET

Jiajia Liu: Deep Space Exploration Lab/School of Earth and Space Sciences, University of Science and Technology of China, Hefei, 230026, China; CAS Key Laboratory of Geospace Environment/CAS Center for Excellence in Comparative Planetology/Mengcheng National Geophysical Observatory, University of Science and Technology of China, Hefei, 230026, China
Chunyu Ji: CAS Key Laboratory of Geospace Environment/CAS Center for Excellence in Comparative Planetology/Mengcheng National Geophysical Observatory, University of Science and Technology of China, Hefei, 230026, China
Yimin Wang: School of Data Science, Qingdao University of Science and Technology, Qingdao, 266100, China
Szabolcs Soós: Department of Astronomy, Eötvös Loránd University, Budapest, Pázmány P. sétány 1/A, H-1117, Hungary; Gyula Bay Zoltán Solar Observatory (GSO), Hungarian Solar Physics Foundation (HSPF), Petőfi tér 3., Gyula, H-5700, Hungary
Ye Jiang: School of Information Science and Technology, Qingdao University of Science and Technology, Qingdao, 266100, China
Robertus Erdélyi: Solar Physics and Space Plasma Research Centre (SP2RC), School of Mathematics and Statistics, The University of Sheffield, Sheffield, S3 7RH, UK; Department of Astronomy, Eötvös Loránd University, Budapest, Pázmány P. sétány 1/A, H-1117, Hungary; Gyula Bay Zoltán Solar Observatory (GSO), Hungarian Solar Physics Foundation (HSPF), Petőfi tér 3., Gyula, H-5700, Hungary
M. B. Korsós: University of Sheffield, Department of Automatic Control and Systems Engineering, Amy Johnson Building, Portabello Street, Sheffield, S1 3JD, UK; Department of Astronomy, Eötvös Loránd University, Budapest, Pázmány P. sétány 1/A, H-1117, Hungary; Gyula Bay Zoltán Solar Observatory (GSO), Hungarian Solar Physics Foundation (HSPF), Petőfi tér 3., Gyula, H-5700, Hungary
Yuming Wang: Deep Space Exploration Lab/School of Earth and Space Sciences, University of Science and Technology of China, Hefei, 230026, China; CAS Key Laboratory of Geospace Environment/CAS Center for Excellence in Comparative Planetology/Mengcheng National Geophysical Observatory, University of Science and Technology of China, Hefei, 230026, China
Abstract

Coronal jets are one of the most common eruptive activities in the solar atmosphere. They are related to rich physical processes, including but not limited to magnetic reconnection, flaring, instabilities, and plasma heating. Automated identification of off-limb coronal jets has been difficult due to their abundant nature, complex appearance, and relatively small size compared to other features in the corona. In this paper, we present an automated coronal jet identification algorithm (AJIA) that uses true and fake jets previously detected by a laborious semi-automated jet identification algorithm (SAJIA, Liu et al. 2023) as the input of an image segmentation neural network, U-NET. AJIA achieves a much higher detection precision (0.81) than SAJIA (0.34), while also giving the probability that each pixel in an input image belongs to a jet. We demonstrate that, with the aid of artificial neural networks, AJIA could enable fast, accurate, and real-time coronal jet identification from SDO/AIA 304 Å observations, which is essential for studying the collective and long-term behavior of coronal jets and their relation to the solar activity cycles.

facilities: SDO (AIA)

1 Introduction

Jets are abundant in the solar atmosphere. A large number of jets with various scales and temperatures originating from different locations on-disk or off-limb have been observed using modern telescopes since the first observations of H$\alpha$ surges (“cold jets”, Newton, 1934) almost one century ago. Based on their different sizes, solar jets are often divided into two categories: small-scale jets and large-scale jets.

Small-scale jets are usually referred to as spicules. Spicules are further subdivided into the traditional Secchi type (also called type-I) and those generated by magnetic reconnection (type-II, also referred to as rapid blue/red excursions, RBEs/RREs); both are usually observed in the chromosphere and transition region (e.g., Beckers, 1968; Sterling, 2000; De Pontieu et al., 2007; Sekse et al., 2012). The importance of small-scale jets is well known, as they are suggested to contribute substantially to coronal heating and solar wind acceleration (e.g., He et al., 2009; Moore et al., 2011; Goodman, 2012; Samanta et al., 2019). The triggering mechanisms of spicules are complicated and could involve (combined) effects of small-scale magnetic reconnections (e.g., De Pontieu et al., 2007; Samanta et al., 2019), waves (e.g., Heggland et al., 2007; Jess et al., 2009, 2012; Dey et al., 2022), and vortices/Alfvén pulses (e.g., Liu et al., 2019a, b; Oxley et al., 2020; Battaglia et al., 2021; Scalisi et al., 2021b, a).

Large-scale jets have been given different names based on the passbands they are observed in, including white-light jets (e.g., Filippov et al., 2011; Kudriavtseva & Prosovetsky, 2019), H$\alpha$ surges (e.g., Brooks et al., 2007; Zhelyazkov et al., 2015), UV/EUV jets (e.g., Liu et al., 2015a; Chen et al., 2017; Liu et al., 2019c; Zhang et al., 2021; Schmieder et al., 2022), and X-ray jets (e.g., Shibata et al., 1992; Cirtain et al., 2007). Although various models have been proposed (e.g., Shibata et al., 1992; Canfield et al., 1996; Moore et al., 2010; Sterling et al., 2015; Pariat et al., 2015), almost all have magnetic reconnections, especially the interchange reconnection between open and closed magnetic field lines, involved as the triggering mechanism of large-scale jets. Besides, they have been widely found to be related to many phenomena at different scales, including rotational motions (e.g., Liu et al., 2014; Raouafi et al., 2016; Shen, 2021), waves/instabilities (e.g., Giannios & Spruit, 2006; Cirtain et al., 2007; Kuridze et al., 2016; Bogdanova et al., 2017; Zhao et al., 2018; Li et al., 2023), blobs (e.g., Zhang & Ji, 2014; Ni et al., 2017; Chen et al., 2022), radio bursts (e.g., Mulay et al., 2016; Hou et al., 2023), “switchbacks” in the solar wind (e.g., Sterling & Moore, 2020; Raouafi et al., 2023), and coronal mass ejections (CMEs, e.g., Shen et al., 2012; Liu et al., 2015b; Zheng et al., 2016; Chen et al., 2021). These have made solar jets one of the most important phenomena that connect small and large scales, lower and higher layers, and flows and waves in the highly magnetized and stratified solar atmosphere.

Owing to their complex observational features and abundant nature, it has been rare to study the statistical and long-term behavior of solar jets using a dataset with a large number of events, although we have now entered an era with a tremendous amount of high-spatial and high-temporal resolution observations of the Sun. Musset et al. (2023) started a citizen science initiative called “Solar Jet Hunter” to utilize human resources worldwide in manually identifying coronal jets observed in the Atmospheric Imaging Assembly (AIA, Lemen et al., 2012) 304 Å passband onboard the Solar Dynamics Observatory (SDO, Pesnell et al., 2012). Although more than 800 coronal jets have been reported, this approach suffers from some shortcomings, including the low efficiency of manual identification and the inconsistency of the criteria between different individuals, where the latter could pose unknown bias when statistical analysis is performed.

Figure 1: Examples of coronal jets and detection results. Images in panel a) are patches of SDO/AIA 304 Å observations with a size of $96\times96$ pix$^2$. Panel b) lists the corresponding ground truths where yellow colors denote pixels belonging to jets. Panel c) shows detection results by the semi-automated jet detection algorithm developed by Liu et al. (2023). Panel d) shows the jet detection results by the automated jet identification algorithm (AJIA) proposed in this paper.

To facilitate more systematic studies of off-limb coronal jets with less human biases, Liu et al. (2023) developed a semi-automated jet identification algorithm (SAJIA) based on applying traditional computer vision techniques to SDO/AIA 304 Å observations. More than 1200 coronal jets were detected by applying SAJIA to SDO/AIA 304 Å observations obtained from 2010 to 2020. A power-law distribution of the jets’ thermal energy was found to be highly consistent with those of micro-flares, indicating that they should result from the same nonlinear statistics of scale-free processes. This result was also supported by the first coronal jet butterfly diagram, which is usually seen in the migration of sunspots during solar activity cycles. By doubling the number of observations and extending them to the end of 2021, Soós et al. (2024) expanded the dataset to more than 2700 coronal jets and found some intriguing oscillatory behaviors from their spatial-temporal distributions. It is worth noting that many of these detected jets are in polar regions. Previous studies suggest that solar jets at various scales (e.g., Chandrashekhar et al., 2014; Chitta et al., 2023; Uritsky et al., 2023) could contribute to energizing the solar wind. The above dataset would enable further such studies from a statistical perspective.

However, it should be noted that the automated identification part in SAJIA has a relatively low precision ($\sim$0.34) and suffers from the CCD degradation of the AIA instrument (Dos Santos et al., 2021; Liu et al., 2023; Soós et al., 2024). One way to address this issue was to check the identification results manually to eliminate fake jets. This process was time-consuming and prevented SAJIA from being deployed for real-time jet detection. This paper presents the automated jet identification algorithm (AJIA) with the U-NET neural network (Ronneberger et al., 2015). We demonstrate that the average precision of AJIA is above 0.8, which enables more accurate coronal jet detection. The paper is organized as follows: the dataset is described in Sect. 2 with the model and training process detailed in Sect. 3. Results are presented in Sect. 4, before the conclusions and discussions in Sect. 5.

2 Data

True and fake jets detected by SAJIA are used as the input of the U-NET model to be detailed in the next section. The method of SAJIA (Liu et al., 2023) is briefly recapped as follows:

  • For a given SDO/AIA 304 Å image, a background is constructed using four images obtained on the same day and then subtracted from the given image.

  • The solar disk (with a radius of 1.02 solar radii) of the background-removed image is masked as we only detect off-limb jets.

  • The masked image ($4096\times4096$ pix$^2$) is downgraded to $512\times512$ pix$^2$ to reduce the computational power needed.

  • The downgraded image is then normalized and binarised with given thresholds.

  • The Douglas-Peucker algorithm (Douglas & Peucker, 1973) is employed to determine the shape of bright features in the image and yield candidate polygons.

  • Polygons with four edges, inclination angles less than 60°, and aspect ratios greater than 1.5 are kept as jet candidates.

  • Each candidate is manually checked to determine whether it is a true or fake jet.
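The polygon-approximation and shape-filtering steps above can be sketched as follows. This is a minimal pure-Python illustration and not the SAJIA implementation: the function names, the tolerance `epsilon`, the treatment of the contour as an open polyline, and the way the aspect ratio is measured (longest over shortest side of a four-vertex polygon) are all assumptions made for the example. The inclination-angle test is omitted because its reference direction is not specified in this section.

```python
import math

def point_segment_distance(p, a, b):
    """Distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Projection of p onto the segment, clamped to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def douglas_peucker(points, epsilon):
    """Simplify an open polyline, keeping points farther than epsilon from the chord."""
    if len(points) < 3:
        return list(points)
    dmax, index = 0.0, 0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], points[0], points[-1])
        if d > dmax:
            dmax, index = d, i
    if dmax <= epsilon:
        return [points[0], points[-1]]
    left = douglas_peucker(points[:index + 1], epsilon)
    right = douglas_peucker(points[index:], epsilon)
    return left[:-1] + right  # the split point appears in both halves

def is_jet_like(polygon, min_aspect=1.5):
    """Keep four-vertex polygons whose longest/shortest side ratio exceeds min_aspect (assumed definition)."""
    if len(polygon) != 4:
        return False
    sides = [math.dist(polygon[i], polygon[(i + 1) % 4]) for i in range(4)]
    shortest = min(sides)
    return shortest > 0 and max(sides) / shortest > min_aspect
```

For example, `is_jet_like([(0, 0), (10, 0), (10, 2), (0, 2)])` accepts an elongated quadrilateral, while a square or a triangle is rejected.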

By applying the above processes to SDO/AIA 304 Å observations from 2010-06-01 to 2021-12-31 with six images per day at a cadence of 3 hours from 00 UT, 7890 jet candidates were detected (Liu et al., 2023; Soós et al., 2024). Among all the candidates, 2704 are found to be true jets, and 5186 are fake ones, resulting in a precision of $2704/7890\approx0.34$. Initially, full-disk images that contain the above jet candidates were used to build the input of the U-NET model. However, it resulted in poor performance, with the model generating all off-limb bright features but not focusing on jets. This result was unsurprising as jets are relatively small in the full-disk observations, and other bright features would have introduced many distractions to the neural network model.

Figure 2: Architecture of U-NET. This cartoon is adopted from Ronneberger et al. (2015). See Sect. 3 for a detailed description of the U-NET architecture.

Considering that the largest detected jet has a length of approximately 70 pixels, small patches of $96\times96$ pix$^2$ centered at each jet candidate are then extracted from the masked observations (see the first three steps in SAJIA described above). This particular size of patches could minimize the appearance of non-jet features in the images while ensuring that one single jet would not be cut into several patches. These patches are then normalized to [0, 1] with a threshold of 5, which was determined via trial and error. Some examples of these patches are shown in Figure 1 a), where the first three rows are true jets, and the last two contain fake jets. Patches with the same size are generated to serve as the ground truth (labels) of the neural network, with pixels covered by true jets set to 1 and all other pixels set to 0. Figure 1 b) shows the ground truth of the corresponding observations in panel a). SAJIA detections are depicted in panel c).
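The patch extraction and normalization described above might look like the sketch below. It is illustrative only: images are plain nested lists, pixels falling outside the image are zero-filled, and "normalized to [0, 1] with a threshold of 5" is interpreted as clipping values to [0, 5] and dividing by 5. That interpretation, along with the function names, is an assumption rather than a documented detail of the pipeline.

```python
def extract_patch(image, center_row, center_col, size=96):
    """Cut a size x size patch centred on the given pixel; out-of-image pixels become 0."""
    half = size // 2
    n_rows, n_cols = len(image), len(image[0])
    patch = []
    for r in range(center_row - half, center_row + half):
        row = []
        for c in range(center_col - half, center_col + half):
            inside = 0 <= r < n_rows and 0 <= c < n_cols
            row.append(image[r][c] if inside else 0.0)
        patch.append(row)
    return patch

def normalize_patch(patch, threshold=5.0):
    """Clip each pixel to [0, threshold] and rescale to [0, 1] (assumed reading of the text)."""
    return [[min(max(v, 0.0), threshold) / threshold for v in row] for row in patch]
```

A candidate centred at pixel `(row, col)` of the masked image would then be prepared with `normalize_patch(extract_patch(image, row, col))`.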

The resulting dataset contains 2704 (5186) pairs of image and label patches of true (fake) jets. These true jets have projected lengths in the plane of the sky from more than 10 Mm to about 330 Mm. All jets are divided into two parts: 80% into the train set and 20% into the validation set. However, the dataset is imbalanced as there are 91.7% more fake jets than true jets. To solve this problem, new images and labels are generated by randomly flipping and rotating (between $\pm0.4\pi$, large enough yet below $0.5\pi$, above which many parts of the images would be cropped) the original images and labels of true jets. The above data augmentation is performed separately in the train and test sets to avoid possible data leakage. After the data augmentation, the dataset is balanced with 5186 fake jets and 5408 true jets. The final dataset has 8474 (2120) pairs of images and labels in the train (test) set.
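The order of operations matters here: splitting first and augmenting afterwards keeps every augmented copy in the same subset as its original. The sketch below illustrates this under several assumptions: one augmented copy is made per true jet (which reproduces the doubling from 2704 to 5408), flips are horizontal only, rotation uses nearest-neighbour sampling with zero fill, and all function names are hypothetical.

```python
import math
import random

def flip_horizontal(img):
    return [row[::-1] for row in img]

def rotate(img, angle):
    """Rotate a square image about its centre by `angle` radians (nearest neighbour, zero fill)."""
    n = len(img)
    c = (n - 1) / 2.0
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    out = [[0.0] * n for _ in range(n)]
    for r in range(n):
        for col in range(n):
            # Inverse mapping: locate the source pixel for each output pixel.
            y, x = r - c, col - c
            src_r = round(cos_a * y + sin_a * x + c)
            src_c = round(-sin_a * y + cos_a * x + c)
            if 0 <= src_r < n and 0 <= src_c < n:
                out[r][col] = img[src_r][src_c]
    return out

def augment_pair(image, label, rng):
    """Apply the same random flip and rotation (within +/-0.4*pi) to an image and its label."""
    if rng.random() < 0.5:
        image, label = flip_horizontal(image), flip_horizontal(label)
    angle = rng.uniform(-0.4 * math.pi, 0.4 * math.pi)
    return rotate(image, angle), rotate(label, angle)

def split_then_augment(true_pairs, fake_pairs, train_fraction=0.8, seed=0):
    """Split each class 80/20 first, then augment true jets inside each subset separately."""
    rng = random.Random(seed)
    subsets = {"train": [], "test": []}
    for pairs, is_true in ((true_pairs, True), (fake_pairs, False)):
        pairs = list(pairs)
        rng.shuffle(pairs)
        cut = int(train_fraction * len(pairs))
        for name, part in (("train", pairs[:cut]), ("test", pairs[cut:])):
            subsets[name].extend(part)
            if is_true:
                subsets[name].extend(augment_pair(img, lab, rng) for img, lab in part)
    return subsets["train"], subsets["test"]
```

Because the label is transformed with the same flip and angle as the image, and nearest-neighbour sampling never interpolates, the labels stay strictly binary after augmentation.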

3 Model and Training

Figure 3: Model losses under different learning rates. The upper (lower) panel shows the logarithm of the training (validation) losses for different combinations of learning rates and decaying steps. A decaying step of $Inf$ means the learning rate is fixed without any decaying.

The U-NET convolutional neural network architecture (see Fig. 2) was initially proposed by Ronneberger et al. (2015) for biomedical image segmentation purposes. It was applied to transmitted light microscopy images and won the ISBI cell tracking challenge 2015 (Ronneberger et al., 2015). U-NET was then modified and successfully applied for different kinds of image segmentation purposes, including 3D image segmentation and road segmentation (e.g., Minaee et al., 2021).

U-NET contains several convolutional layers with different filter sizes. In the first step, the input image with a size of $96\times96\times1$ (yellow block in Fig. 2) is taken into two convolutional layers, each having 64 filters with a kernel size of $3\times3$. In each layer, the Rectified Linear Unit (ReLU) activation function is used after each convolutional operation, where $ReLU(x)=max(0,x)$. The resulting image after the first step has a size of $96\times96\times64$. This image is then down-sampled to $48\times48\times64$ by a max pooling operation (purple arrows in Fig. 2), where only the maximum value in every $2\times2$ region in the image is kept, and all other pixels are discarded. The image is then taken into the next step, which contains two convolutional layers but doubles the number of filters (128).
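The two element-wise building blocks named above, ReLU and $2\times2$ max pooling, are simple enough to write out directly. The following is a plain-Python sketch on a single-channel feature map stored as nested lists; it is meant to show the arithmetic, not how a deep-learning framework implements these layers.

```python
def relu(x):
    """Rectified Linear Unit: max(0, x)."""
    return max(0.0, x)

def max_pool_2x2(feature_map):
    """Keep only the maximum of every non-overlapping 2x2 block, halving each dimension."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [max(feature_map[r][c], feature_map[r][c + 1],
             feature_map[r + 1][c], feature_map[r + 1][c + 1])
         for c in range(0, cols - 1, 2)]
        for r in range(0, rows - 1, 2)
    ]

# A 4x4 map becomes 2x2 after pooling, just as 96x96 becomes 48x48.
feature_map = [[1, -2, 3, 0],
               [4, 5, -6, 7],
               [0, 0, 1, 1],
               [-1, 2, 3, -4]]
activated = [[relu(v) for v in row] for row in feature_map]
pooled = max_pool_2x2(activated)
```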

The above process is repeated until the image size is down-sampled to $6\times6$ but with 1024 filters. Then, a reverse series of operations of the above process is performed to up-sample (yellow arrows in Fig. 2) the image until it again has a size of $96\times96\times64$. Two extra convolutional layers with 2 and 1 filters are used to generate the final output image (blue block in Fig. 2). This unique convolutional neural network is named “U-NET” as its architecture resembles the letter U (Fig. 2). The left (right) part of U-NET is usually called the encoder (decoder). Layers in the encoder are skip-connected with layers in the decoder (grey arrows in Fig. 2). These skip connections remind U-NET of the fine details learned in the encoder that could be used to construct images in the decoder. U-NET has been found particularly effective and successful in image segmentation as its contracting path (down-sampling) can capture the context of an image, and its symmetric expanding path (up-sampling) can enable precise localization (Ronneberger et al., 2015). The loss function of U-NET is set to be the binary cross-entropy loss, which is defined as follows:
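To make the sequence of feature-map sizes concrete, the small helper below traces the shapes through the encoder, the bottleneck, and the decoder. It assumes size-preserving convolutions (consistent with the $96\times96$ size quoted after the first step), halving by pooling, and filter counts that double at each level; the function name and the intermediate filter counts are inferred for illustration and are not taken from the trained model.

```python
def unet_shapes(input_size=96, base_filters=64, depth=4):
    """Return (height, width, channels) for each encoder level, the bottleneck, and each decoder level."""
    encoder = []
    size, filters = input_size, base_filters
    for _ in range(depth):
        encoder.append((size, size, filters))  # after two 3x3 convolutions
        size //= 2                              # 2x2 max pooling halves the size
        filters *= 2                            # next level doubles the filters
    bottleneck = (size, size, filters)
    decoder = []
    for _ in range(depth):
        size *= 2                               # up-sampling doubles the size
        filters //= 2
        decoder.append((size, size, filters))
    return encoder, bottleneck, decoder

encoder, bottleneck, decoder = unet_shapes()
```

With the defaults, the bottleneck comes out as `(6, 6, 1024)` and the last decoder level as `(96, 96, 64)`, matching the sizes stated in the text.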

$$\log(L)=\frac{1}{N}\sum_{i=1}^{N}\left[y_{i}\log(\hat{y_{i}})+(1-y_{i})\log(1-\hat{y_{i}})\right], \qquad (1)$$

where $L$ is the loss, $N$ is the number of pixels in each image, $y_{i}$ is the value (0 or 1) of each pixel in the label, and $\hat{y_{i}}$ is the corresponding prediction value (0 to 1).
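As a numerical illustration, the sketch below evaluates the binary cross-entropy for a flattened label and prediction. It uses the sign convention common in deep-learning libraries, where the reported loss is the negative of the pixel-averaged bracketed term, so that a perfect prediction gives a loss close to zero. That convention, the clipping constant `eps`, and the function name are assumptions of this example.

```python
import math

def binary_cross_entropy(labels, predictions, eps=1e-7):
    """Negative mean of y*log(y_hat) + (1-y)*log(1-y_hat) over all pixels."""
    total = 0.0
    for y, y_hat in zip(labels, predictions):
        y_hat = min(max(y_hat, eps), 1.0 - eps)  # avoid log(0)
        total += y * math.log(y_hat) + (1 - y) * math.log(1.0 - y_hat)
    return -total / len(labels)
```

An uninformative prediction of 0.5 for every pixel gives a loss of $\ln 2\approx0.69$ regardless of the labels, which is a useful reference point when reading loss curves.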

Figure 4: Training history with a fixed learning rate of $10^{-4}$. The blue (orange) curve is the evolution of the training (validation) loss.

All 8474 pairs of images and labels in the train set were taken into the above U-NET neural network to train a jet identification model. Considering the capacity of the GPU (Nvidia GeForce RTX 4090 with 24 GB of memory), the batch size was set to 256. Another vital hyper-parameter during training is the learning rate. The learning rate determines how much the model weights are updated in response to the loss of each batch. A learning rate that is too large will cause the model to skip over the minimum of the loss function and make it hard to converge, while a learning rate that is too small will probably trap the model in a local minimum of the loss function. A common practice is to set a relatively large learning rate at the beginning of the training and decrease it along a given function at certain steps. In the case of our model, a cosine decay function is used to avoid the learning rate decreasing too fast (see, e.g., Loshchilov & Hutter, 2016, for more details).
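A cosine decay schedule can be written in a few lines. The version below is a single-cycle sketch that decays from the initial learning rate to zero over `decay_steps` and stays there afterwards; the absence of warm restarts, the zero floor, and the function name are simplifying assumptions. An infinite number of decay steps is treated as a fixed learning rate.

```python
import math

def cosine_decay_lr(step, initial_lr, decay_steps):
    """Learning rate at a given step under single-cycle cosine decay."""
    if decay_steps is None or math.isinf(decay_steps):
        return initial_lr  # no decay: fixed learning rate
    progress = min(step, decay_steps) / decay_steps
    return initial_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

The cosine shape decreases slowly at first, falls fastest in the middle of the schedule, and flattens again near the end, so the early epochs still benefit from a comparatively large step size.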

Figure 3 depicts the losses obtained in the train (upper panel) and validation sets (lower panel) with different initial learning rates and decaying steps. An “infinite” decaying step represents a fixed learning rate without decay during training. It can be seen from Figure 3 that the minimum losses (on the order of $\sim10^{-3}$) in both the train and validation sets can be achieved when the learning rate is fixed at $10^{-4}$.

4 Results

4.1 Model selection

The blue curve in Figure 4 is the evolution of the training loss, with a fixed learning rate of $10^{-4}$. It is seen that the training loss decreases as the number of training epochs grows and reaches its minimum of $1.2\times10^{-3}$ at 117 epochs. At 117 epochs, the validation loss (orange curve in Fig. 4) is $5.5\times10^{-3}$, which is also around its minimum. Two other commonly used parameters to measure the performance of image segmentation tasks are the mean average precision (mAP) and the mean intersection-over-union (mIoU). Here, we report that, at 117 epochs, the trained model has an mAP of 0.87 and an mIoU of 0.50 for the training set, and an mAP of 0.58 and an mIoU of 0.50 for the validation set.
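For readers unfamiliar with the intersection-over-union, the sketch below computes it for one binary label and one predicted probability map, and averages it over a set of patches. The exact recipe behind the mIoU values quoted above is not spelled out here, so the binarization threshold of 0.5, the averaging over patches, and the convention of returning 1 when both masks are empty are assumptions made only for illustration.

```python
def intersection_over_union(label, prediction, threshold=0.5):
    """IoU between a binary label and a probability map binarized at `threshold`."""
    intersection = union = 0
    for label_row, pred_row in zip(label, prediction):
        for y, p in zip(label_row, pred_row):
            predicted = p >= threshold
            actual = y == 1
            intersection += predicted and actual
            union += predicted or actual
    # Assumed convention: two empty masks agree perfectly.
    return intersection / union if union else 1.0

def mean_iou(labels, predictions, threshold=0.5):
    """Average the per-patch IoU over a collection of patches."""
    scores = [intersection_over_union(l, p, threshold)
              for l, p in zip(labels, predictions)]
    return sum(scores) / len(scores)
```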

Figure 5: Confusion matrix of the trained U-NET model. Panel a) shows how the recall (blue curve), TNR (orange curve), precision (green curve), and F-measure (red curve) change with different thresholds. See Equation 2 for the definitions of the above measurements. Panel b) is the distribution of true and fake jets in the input labels and predictions made by AJIA.

Based on the above observations, we use the model trained at 117 epochs as the final coronal jet identification model. This model takes the $96\times96$ patches of the SDO/AIA 304 Å observations as its input and automatically identifies coronal jets (hence the name Automated Jet Identification Algorithm, AJIA). Images in Figure 1 d) are the predictions by AJIA based on the inputs in panel a). Colors in the images denote the probability of the corresponding pixels belonging to jets. A pixel with a value of 1 (0) means that AJIA assigns a 100% (0%) chance that this pixel belongs to a jet. In the first three rows, where true jets are present in the input images, AJIA successfully identifies jets that are almost identical to the ground truths. In the last two rows, where there are no jets but our traditional jet identification algorithm SAJIA (Liu et al., 2023) wrongly detects jets, AJIA successfully avoids making the same mistakes.

4.2 Model evaluation

To evaluate the performance of AJIA, a threshold needs to be defined, above which a feature detected by AJIA is considered a jet. Specifically, for a threshold $T$, a feature detected by AJIA is considered a jet (true/positive) only if its maximum predicted value is no less than $T$. Otherwise, it is considered a non-jet (fake/negative). Panel a) in Figure 5 shows how the recall, the true negative rate (TNR), the precision, and the F-measure evolve with different thresholds $T$. Recall, TNR, precision, and F-measure are defined as follows:

$$\begin{split}\mathrm{Recall}&=\frac{TP}{TP+FN},\\ \mathrm{TNR}&=\frac{TN}{TN+FP},\\ \mathrm{Precision}&=\frac{TP}{TP+FP},\\ \mathrm{F\text{-}Measure}&=\frac{2\times\mathrm{Precision}\times\mathrm{Recall}}{\mathrm{Precision}+\mathrm{Recall}},\end{split} \qquad (2)$$

where $TP$ is true positive (AJIA successfully detects the jet in the label), $FN$ is false negative (AJIA misses the jet in the label), $TN$ is true negative (there is no jet in the label, and AJIA also does not detect any jet), and $FP$ is false positive (there is no jet in the label but AJIA detects a jet). One can see from the above definitions that recall represents the percentage of true jets in the labels that are successfully detected by AJIA. TNR denotes the percentage of fake jets in the labels that are also considered fake jets by AJIA. Precision is the percentage of true jets in all jets detected by AJIA. F-measure measures the combined effect of precision and recall.
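The decision rule and the four measurements can be expressed compactly in code. In the sketch below, a whole patch is treated as one detected feature and is flagged as a jet when its maximum predicted value is no less than the threshold; treating the patch as a single feature, as well as the function names, is a simplification assumed for this example.

```python
def is_jet(prediction, threshold):
    """A patch counts as a jet if its maximum predicted pixel value is >= threshold."""
    return max(max(row) for row in prediction) >= threshold

def confusion_counts(truths, detections):
    """Count TP, FN, TN, FP from boolean ground-truth and detection flags."""
    tp = sum(t and d for t, d in zip(truths, detections))
    fn = sum(t and not d for t, d in zip(truths, detections))
    tn = sum(not t and not d for t, d in zip(truths, detections))
    fp = sum(not t and d for t, d in zip(truths, detections))
    return tp, fn, tn, fp

def detection_metrics(tp, fn, tn, fp):
    """Recall, TNR, precision and F-measure from the confusion counts."""
    recall = tp / (tp + fn)
    tnr = tn / (tn + fp)
    precision = tp / (tp + fp)
    f_measure = 2 * precision * recall / (precision + recall)
    return {"recall": recall, "tnr": tnr,
            "precision": precision, "f_measure": f_measure}
```

For instance, `detection_metrics(8, 2, 6, 4)` (made-up counts) gives a recall of 0.8, a TNR of 0.6, a precision of about 0.67, and an F-measure of about 0.73.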

We can see from Figure 5 a) that both TNR and precision increase as the threshold grows. However, the recall decreases with the threshold. When the threshold is around 0.76, the F-measure peaks at $\sim$0.81, and all three other measurements converge at similar values. This suggests that a threshold of 0.76 would give the most ideal and balanced performance. Figure 5 b) is a comparison between the ground truths and the predictions of AJIA. Among 1040 candidates identified by AJIA, 835 (205) are true (fake) jets, yielding a precision of $\sim$80.3%.
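Choosing the threshold at which the F-measure peaks amounts to a one-dimensional sweep. The sketch below shows the idea on made-up scores (the maximum predicted value of each patch) and made-up ground-truth flags; the data, the candidate thresholds, and the function names are purely illustrative and do not reproduce the curve in Figure 5 a).

```python
def f_measure_at(scores, truths, threshold):
    """F-measure when patches with score >= threshold are flagged as jets."""
    tp = sum(1 for s, t in zip(scores, truths) if t and s >= threshold)
    fp = sum(1 for s, t in zip(scores, truths) if not t and s >= threshold)
    fn = sum(1 for s, t in zip(scores, truths) if t and s < threshold)
    denom = 2 * tp + fp + fn  # algebraically equal to 2PR/(P+R) = 2TP/(2TP+FP+FN)
    return 2 * tp / denom if denom else 0.0

def best_threshold(scores, truths, thresholds):
    """Return the candidate threshold with the highest F-measure."""
    return max(thresholds, key=lambda t: f_measure_at(scores, truths, t))

# Hypothetical scores and labels, for illustration only.
scores = [0.9, 0.8, 0.4, 0.3, 0.95, 0.2]
truths = [True, True, True, False, False, False]
chosen = best_threshold(scores, truths, [0.25, 0.5, 0.85])
```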

4.3 Application to higher-cadence data

The current dataset used for building the model was generated by Liu et al. (2023) and Soós et al. (2024) and contains coronal jets detected with a cadence of 3 hours. This low cadence, together with the relatively low precision of SAJIA, results in a missing rate of $\sim$30% in non-polar regions (see estimations in Liu et al., 2023) and prevents us from further studying the temporal evolution of the detected jets. The high performance of AJIA indicated by its high recall, TNR, precision, and F-measure, as detailed in the previous subsection, provides an excellent opportunity to look into the above issue via automatically detecting jets with high accuracy at higher cadences.

To test the application of AJIA to higher-cadence data, we employed SAJIA to detect coronal jets at 1-hour intervals from 00:30 UT every day throughout January 2011. SAJIA yields 409 jet candidates, compared to 68 jet candidates given by SAJIA with a 3-hour cadence in January 2011. After laborious identification of these jet candidates by downloading and checking their temporal evolution one by one, 235 are identified as true, and the other 174 are fake. Among all fake jets, $\sim$94% are (part of) prominences, CMEs, or coronal rains. This gives a precision of SAJIA of $\sim$51%, consistent with the findings in Liu et al. (2023) and Soós et al. (2024) (also see discussions in Sect. 5).

These jets were not included in the previous dataset employed to build AJIA and could be used to test the application of AJIA to unknown events. Figure 6 depicts the confusion matrix of AJIA’s prediction on the above 235 true and 174 fake jets. TP, TN, FP, and FN are 213, 148, 26, and 22, respectively. This indicates that the precision of AJIA in detecting these unknown events is about 0.81. The recall, TNR, and F-measure are 0.81, 0.80, and 0.81, respectively. These values are consistent with what was found in the validation set as described in the previous subsections and further suggest AJIA’s potential to detect off-limb coronal jets accurately.

Figure 6: Confusion matrix of AJIA with 1-hour cadence data. Similar to panel b) in Fig. 5, this figure shows the distribution of true and fake jets in the input labels and predictions made by AJIA from the 235 true and 174 fake jets detected in January 2011.
Figure 7: Precision of AJIA and SAJIA for jets in different years. Colored dots and the solid curve are the yearly precisions of AJIA, with colors denoting the normalized (to the red number 375) number of events. Colored diamonds and the dashed curve are the yearly precisions of SAJIA, with colors denoting the normalized (to the red number 787) number of events, using data from Soós et al. (2024).

5 Conclusions and Discussions

In this paper, we presented the development of the Automated Jet Identification Algorithm (AJIA), which is built based on off-limb coronal jets detected by our previously developed semi-automated jet identification algorithm (SAJIA, Liu et al., 2023). These jets were fed into a U-NET (Ronneberger et al., 2015) neural network to train the final model. Evaluating AJIA on a test set containing 2120 true and fake jets yields a precision, recall, TNR, and F-measure of around 0.81, where the precision is significantly larger than that of SAJIA (0.34).

It was found in Soós et al. (2024) that the precision of SAJIA is heavily impacted by the CCD degradation of SDO/AIA. Diamonds connected by dashed lines in Figure 7 are the precisions of SAJIA measured each year (inferred from Table 1 in Soós et al., 2024), with colors denoting the normalized number of events. In general, the precision of SAJIA undergoes an overall decreasing trend, which is consistent with the overall decreasing sensitivity of the SDO/AIA detectors (e.g., Dos Santos et al., 2021). We demonstrate that AJIA is not affected by the same effect, although it was trained using SDO/AIA 304 Å images before being corrected for CCD degradation. Dots connected by solid lines in Figure 7 are the precisions of AJIA measured each year, where colors are the normalized number of events. It can be seen that the performance of AJIA does not decrease with time, and its minimum value ($>$0.55 in 2018) is above the maximum precision of SAJIA ($\sim$0.54 in 2010).

To conclude, AJIA is a step forward compared to SAJIA as it is more precise (with a precision of 0.81) and not affected by CCD degradation. AJIA is also fast, and it takes less than 6 seconds to make predictions for all 2120 images in the test set (2.6 ms per image). Another advantage of AJIA should be noted: it gives the probability that a pixel in the observation belongs to a jet (see panel d in Fig. 1). This enables us to generate jet heatmaps directly from SDO/AIA 304 Å observations and allows real-time jet detection and visualization. These advantages of AJIA are essential in enabling many pieces of research, including but not limited to studying the collective behavior of coronal jets over the long term, their evolution over the solar activity cycles, and their relation with other solar phenomena (see e.g., Liu et al., 2023; Soós et al., 2024).

Future work will also focus on improving AJIA’s detection precision, which might be achieved by adding several fully connected layers after U-NET instead of giving a fixed threshold, as was done in this work. Another future work will utilize the improved model to detect more jets at a much higher cadence (i.e., 1 hour or less) than 3 hours to explore the temporal aspects of coronal jets, especially given the capability of AJIA in accurately detecting jets from such data as described in Sect. 4.3. This will also enable the study of their velocities by developing a dedicated automated algorithm using, including but not limited to, the surfing transform technique (Uritsky et al., 2023), the Gaussian fitting method (Chitta et al., 2023), and the optical flow estimation (Fleet & Weiss, 2006). Their kinetic energy would further be estimated, and the existence of a power-law distribution, which is essential in understanding the fundamental physics of the release of free magnetic energy in the solar atmosphere, would then be examined following Liu et al. (2023); Uritsky et al. (2023).

This will further enable a series of statistical studies that have not been possible so far because of the relatively small number of detected events. For example, Liu et al. (2023) found the “butterfly diagram” of coronal jets, in which the average latitudes of jets migrate from mid-latitudes to the equator from the beginning to the end of the solar activity cycle. It is well known that magnetic elements at high latitudes also migrate toward the polar regions throughout the solar cycle. However, this trend was not seen in Liu et al. (2023). Whether its absence is caused by the limited number of events or by possibly different triggering mechanisms for jets originating from active and non-active regions remains to be examined with a larger dataset containing more events.

With more samples of off-limb coronal jets, the distributions of active-region, quiet-region, and polar jets, and the differences between them, could be further studied. Their different behaviors over the solar activity cycle could also be evaluated. The future large dataset would also enable statistical studies on how coronal jets gain their kinetic energy (e.g., Liu et al., 2014), how much twist they release (e.g., Liu et al., 2019c), and how the magnetic energy is distributed among different forms of energy during their eruption (e.g., Liu et al., 2016). Moreover, preliminary evidence of the so-called solar active longitude (e.g., Gyenge et al., 2017) was given in both Liu et al. (2023) and Soós et al. (2024), but more evidence could be supplied by studying a significantly larger number of jets. Finding and validating active longitudes from these small-scale events will be significant for the theory and simulation of the solar dynamo.

Acknowledgements

We acknowledge the use of data from the Solar Dynamics Observatory (SDO). SDO is the first mission of NASA’s Living With a Star (LWS) program. The SDO/AIA data are publicly available from NASA’s SDO website (https://sdo.gsfc.nasa.gov/data/). This research is supported by the Strategic Priority Research Program of the Chinese Academy of Sciences (Grant No. XDB0560000), the National Key Technologies Research and Development Program of the Ministry of Science and Technology of China (2022YFF0711402), the Informatization Plan of the Chinese Academy of Sciences (CAS-WX2022SF-0103), and the National Natural Science Foundation (NSFC 12373056, 42188101). Yimin Wang and Ye Jiang acknowledge support from the National Natural Science Foundation (NSFC 12303103) and the Natural Science Foundation of Shandong Province (ZR2023QF151). R. Erdélyi is grateful to STFC (UK, grant No. ST/M000826/1) and PIFI (China, grant No. 2024PVA0043). M.B. Korsós acknowledges support from the Leverhulme Trust Fund ECF-2023-271 and UNKP-23-4-II-ELTE-107, ELTE Hungary. R. Erdélyi and M.B. Korsós are also grateful for the support received from NKFIH OTKA (Hungary, grant No. K142987). Sz.S. acknowledges the support (grant No. C1791784) provided by the Ministry of Culture and Innovation of Hungary from the National Research, Development and Innovation Fund, financed under the KDP-2021 funding scheme.

References

  • Battaglia et al. (2021) Battaglia, A. F., Canivete Cuissa, J. R., Calvo, F., Bossart, A. A., & Steiner, O. 2021, A&A, 649, A121, doi: 10.1051/0004-6361/202040110
  • Beckers (1968) Beckers, J. M. 1968, Sol. Phys., 3, 367, doi: 10.1007/BF00171614
  • Bogdanova et al. (2017) Bogdanova, M., Zhelyazkov, I., Joshi, R., & Chandra, R. 2017. https://arxiv.org/abs/1711.10734
  • Brooks et al. (2007) Brooks, D. H., Kurokawa, H., & Berger, T. E. 2007, ApJ, 656, 1197, doi: 10.1086/510144
  • Canfield et al. (1996) Canfield, R. C., Reardon, K. P., Leka, K. D., et al. 1996, ApJ, 464, 1016, doi: 10.1086/177389
  • Chandrashekhar et al. (2014) Chandrashekhar, K., Bemporad, A., Banerjee, D., Gupta, G. R., & Teriaca, L. 2014, A&A, 561, A104, doi: 10.1051/0004-6361/201321213
  • Chen et al. (2021) Chen, H., Yang, J., Hong, J., Li, H., & Duan, Y. 2021, ApJ, 911, 33, doi: 10.3847/1538-4357/abe6a8
  • Chen et al. (2017) Chen, J., Su, J., Deng, Y., & Priest, E. R. 2017, ApJ, 840, 54, doi: 10.3847/1538-4357/aa6c59
  • Chen et al. (2022) Chen, J., Erdélyi, R., Liu, J., et al. 2022, Front. Astron. Sp. Sci., 8, 1, doi: 10.3389/fspas.2021.786856
  • Chitta et al. (2023) Chitta, L. P., Zhukov, A. N., Berghmans, D., et al. 2023, Science, 381, 867, doi: 10.1126/science.ade5801
  • Cirtain et al. (2007) Cirtain, J. W., Golub, L., Lundquist, L., et al. 2007, Science, 318, 1580, doi: 10.1126/science.1147050
  • De Pontieu et al. (2007) De Pontieu, B., McIntosh, S., Hansteen, V. H., et al. 2007, PASJ, 59, 655
  • Dey et al. (2022) Dey, S., Chatterjee, P., O. V. S. N., M., et al. 2022, Nat. Phys., 18, 595, doi: 10.1038/s41567-022-01522-1
  • Dos Santos et al. (2021) Dos Santos, L. F. G., Bose, S., Salvatelli, V., et al. 2021, A&A, 648, A53, doi: 10.1051/0004-6361/202040051
  • Douglas & Peucker (1973) Douglas, D. H., & Peucker, T. K. 1973, Cartographica: the international journal for geographic information and geovisualization, 10, 112
  • Filippov et al. (2011) Filippov, B., Koutchmy, S. L., & Tavabi, E. 2011, Sol. Phys., 286, 143. https://api.semanticscholar.org/CorpusID:119272991
  • Fleet & Weiss (2006) Fleet, D., & Weiss, Y. 2006, Optical Flow Estimation, ed. N. Paragios, Y. Chen, & O. Faugeras (Boston, MA: Springer US), 237–257, doi: 10.1007/0-387-28831-7_15
  • Giannios & Spruit (2006) Giannios, D., & Spruit, H. C. 2006, A&A, 450, 887, doi: 10.1051/0004-6361:20054107
  • Goodman (2012) Goodman, M. L. 2012, ApJ, 757, doi: 10.1088/0004-637X/757/2/188
  • Gyenge et al. (2017) Gyenge, N., Singh, T., Kiss, T. S., Srivastava, A. K., & Erdélyi, R. 2017, ApJ, 838, 18
  • He et al. (2009) He, J., Marsch, E., Tu, C., & Tian, H. 2009, ApJ, 705, L217, doi: 10.1088/0004-637X/705/2/L217
  • Heggland et al. (2007) Heggland, L., De Pontieu, B., & Hansteen, V. H. 2007, ApJ, 666, 1277
  • Hou et al. (2023) Hou, Z., Tian, H., Su, W., et al. 2023, ApJ, 953, 171, doi: 10.3847/1538-4357/ace31b
  • Jess et al. (2009) Jess, D. B., Mathioudakis, M., Erdélyi, R., et al. 2009, Science, 323, 1582. https://arxiv.org/abs/0903.3546
  • Jess et al. (2012) Jess, D. B., Pascoe, D. J., Christian, D. J., et al. 2012, ApJ, 744, L5, doi: 10.1088/2041-8205/744/1/L5
  • Kudriavtseva & Prosovetsky (2019) Kudriavtseva, A. V., & Prosovetsky, D. V. 2019, J. Atmos. Solar-Terrestrial Phys., 193, 105039, doi: 10.1016/j.jastp.2019.05.003
  • Kuridze et al. (2016) Kuridze, D., Zaqarashvili, T. V., Henriques, V., et al. 2016, ApJ, 830, 133, doi: 10.3847/0004-637X/830/2/133
  • Lemen et al. (2012) Lemen, J. R., Title, A. M., Akin, D. J., et al. 2012, Sol. Phys., 275, 17, doi: 10.1007/s11207-011-9776-8
  • Li et al. (2023) Li, X., Keppens, R., & Zhou, Y. 2023, ApJ, 947, L17, doi: 10.3847/2041-8213/acc9ba
  • Liu et al. (2015a) Liu, J., Fang, F., Wang, Y., et al. 2015a, in AGU Fall Meeting Abstracts, Vol. 2015, SH31B–2405
  • Liu et al. (2019a) Liu, J., Nelson, C. J., & Erdélyi, R. 2019a, ApJ, 872, 22, doi: 10.3847/1538-4357/aabd34
  • Liu et al. (2019b) Liu, J., Nelson, C. J., Snow, B., Wang, Y., & Erdélyi, R. 2019b, Nature Communications, 10, 3504, doi: 10.1038/s41467-019-11495-0
  • Liu et al. (2019c) Liu, J., Wang, Y., & Erdélyi, R. 2019c, Frontiers in Astronomy and Space Sciences, 6, 44, doi: 10.3389/fspas.2019.00044
  • Liu et al. (2014) Liu, J., Wang, Y., Liu, R., et al. 2014, ApJ, 782, 94, doi: 10.1088/0004-637X/782/2/94
  • Liu et al. (2015b) Liu, J., Wang, Y., Shen, C., et al. 2015b, ApJ, 813, 115, doi: 10.1088/0004-637X/813/2/115
  • Liu et al. (2016) Liu, J., Wang, Y., Erdélyi, R., et al. 2016, ApJ, 833, 150, doi: 10.3847/1538-4357/833/2/150
  • Liu et al. (2023) Liu, J., Song, A., Jess, D. B., et al. 2023, ApJS, 266, 17, doi: 10.3847/1538-4365/acc85a
  • Loshchilov & Hutter (2016) Loshchilov, I., & Hutter, F. 2016, arXiv e-prints, arXiv:1608.03983, doi: 10.48550/arXiv.1608.03983
  • Minaee et al. (2021) Minaee, S., Boykov, Y. Y., Porikli, F., et al. 2021, IEEE Trans. Pattern Anal. Mach. Intell., 1, doi: 10.1109/TPAMI.2021.3059968
  • Moore et al. (2010) Moore, R. L., Cirtain, J. W., Sterling, A. C., & Falconer, D. A. 2010, ApJ, 720, 757, doi: 10.1088/0004-637X/720/1/757
  • Moore et al. (2011) Moore, R. L., Sterling, A. C., Cirtain, J. W., & Falconer, D. A. 2011, ApJ, 731, L18, doi: 10.1088/2041-8205/731/1/L18
  • Mulay et al. (2016) Mulay, S. M., Tripathi, D., Zanna, G. D., & Mason, H. 2016, A&A, 589, A79, doi: 10.1051/0004-6361/201527473
  • Musset et al. (2023) Musset, S., Jol, P., Sankar, R., et al. 2023, 1. https://arxiv.org/abs/2309.14871
  • Newton (1934) Newton, H. W. 1934, MNRAS, 94, 472
  • Ni et al. (2017) Ni, L., Zhang, Q.-M., Murphy, N. A., & Lin, J. 2017, ApJ, 841, 1, doi: 10.3847/1538-4357/aa6ffe
  • Oxley et al. (2020) Oxley, W., Scalisi, J., Ruderman, M. S., & Erdélyi, R. 2020, ApJ, 905, 168, doi: 10.3847/1538-4357/abcafe
  • Pariat et al. (2015) Pariat, E., Dalmasse, K., DeVore, C. R., Antiochos, S. K., & Karpen, J. T. 2015, A&A, 573, 539, doi: 10.1051/0004-6361/201424209
  • Pesnell et al. (2012) Pesnell, W. D., Thompson, B. J., & Chamberlin, P. C. 2012, Sol. Phys., 275, 3, doi: 10.1007/s11207-011-9841-3
  • Raouafi et al. (2016) Raouafi, N. E., Patsourakos, S., Pariat, E., et al. 2016, Space Sci. Rev., 201, 1, doi: 10.1007/s11214-016-0260-5
  • Raouafi et al. (2023) Raouafi, N. E., Stenborg, G., Seaton, D. B., et al. 2023, ApJ, 945, 28, doi: 10.3847/1538-4357/acaf6c
  • Ronneberger et al. (2015) Ronneberger, O., Fischer, P., & Brox, T. 2015, arXiv e-prints, arXiv:1505.04597, doi: 10.48550/arXiv.1505.04597
  • Samanta et al. (2019) Samanta, T., Tian, H., Yurchyshyn, V., et al. 2019, Science, 366, 890
  • Scalisi et al. (2021a) Scalisi, J., Oxley, W., Ruderman, M. S., & Erdélyi, R. 2021a, ApJ, 911, 39, doi: 10.3847/1538-4357/abe8db
  • Scalisi et al. (2021b) Scalisi, J., Ruderman, M. S., & Erdélyi, R. 2021b, ApJ, 922, 118, doi: 10.3847/1538-4357/ac2509
  • Schmieder et al. (2022) Schmieder, B., Joshi, R., & Chandra, R. 2022, Adv. Sp. Res., 70, 1580, doi: 10.1016/j.asr.2021.12.013
  • Sekse et al. (2012) Sekse, D. H., van der Voort, L. R., & Pontieu, B. D. 2012, ApJ, 752, 108, doi: 10.1088/0004-637X/752/2/108
  • Shen (2021) Shen, Y. 2021, Proceedings of the Royal Society of London Series A, 477, 217, doi: 10.1098/rspa.2020.0217
  • Shen et al. (2012) Shen, Y., Liu, Y., Su, J., & Deng, Y. 2012, ApJ, 745, 164, doi: 10.1088/0004-637X/745/2/164
  • Shibata et al. (1992) Shibata, K., Ishido, Y., Acton, L. W., et al. 1992, PASJ, 44, L173
  • Soós et al. (2024) Soós, S., Liu, J., Korsós, M. B., & Erdélyi, R. 2024, ApJ, 965, 43, doi: 10.3847/1538-4357/ad29f8
  • Sterling (2000) Sterling, A. C. 2000, Sol. Phys., 196, 79
  • Sterling & Moore (2020) Sterling, A. C., & Moore, R. L. 2020, ApJ, 896, L18, doi: 10.3847/2041-8213/ab96be
  • Sterling et al. (2015) Sterling, A. C., Moore, R. L., Falconer, D. A., & Adams, M. 2015, Nature, 523, 437, doi: 10.1038/nature14556
  • Uritsky et al. (2023) Uritsky, V. M., Karpen, J. T., Raouafi, N. E., et al. 2023, ApJ, 955, L38, doi: 10.3847/2041-8213/acf85c
  • Zhang et al. (2021) Zhang, Q. M., Huang, Z. H., Hou, Y. J., et al. 2021, A&A, 647, A113, doi: 10.1051/0004-6361/202038924
  • Zhang & Ji (2014) Zhang, Q. M., & Ji, H. S. 2014, A&A, 567, A11, doi: 10.1051/0004-6361/201423698
  • Zhao et al. (2018) Zhao, T., Ni, L., Lin, J., & Ziegler, U. 2018. https://arxiv.org/abs/1801.09511
  • Zhelyazkov et al. (2015) Zhelyazkov, I., Zaqarashvili, T. V., Chandra, R., Srivastava, A. K., & Mishonov, T. 2015, Advances in Space Research, 56, 2727, doi: 10.1016/j.asr.2015.05.003
  • Zheng et al. (2016) Zheng, R., Chen, Y., Du, G., & Li, C. 2016, ApJ, 819, L18, doi: 10.3847/2041-8205/819/2/L18