A&A, 674 (2023) A107

Open Access

Issue		A&A Volume 674, June 2023


Article Number		A107
Number of page(s)		10
Section		Astronomical instrumentation
DOI		https://doi.org/10.1051/0004-6361/202346302
Published online		09 June 2023

A&A 674, A107 (2023)

Joint machine learning and analytic track reconstruction for X-ray polarimetry with gas pixel detectors

N. Cibrario¹,², M. Negro³,⁴,⁵, N. Moriakov⁶, R. Bonino¹,², L. Baldini⁷,⁸, N. Di Lalla⁹, L. Latronico¹, S. Maldera¹, A. Manfreda⁷,¹⁰, N. Omodei⁹, C. Sgró⁷ and S. Tugliani¹,²

¹ Istituto Nazionale di Fisica Nucleare, Sezione di Torino, Via Pietro Giuria 1, 10125 Torino, Italy
e-mail: nicolo.cibrario@unito.it
² Dipartimento di Fisica, Università degli Studi di Torino, Via Pietro Giuria 1, 10125 Torino, Italy
³ University of Maryland, Baltimore County, Baltimore, MD 21250, USA
⁴ NASA Goddard Space Flight Center, Greenbelt, 8800 Greenbelt Rd, MD 20771, USA
⁵ Center for Research and Exploration in Space Science and Technology, NASA/GSFC, Greenbelt, 8800 Greenbelt Rd, MD 20771, USA
⁶ Department of Radiation Oncology, Netherlands Cancer Institute, Plesmanlaan 121, 1066 CX Amsterdam, The Netherlands
⁷ Istituto Nazionale di Fisica Nucleare, Sezione di Pisa, Largo B. Pontecorvo 3, 56127 Pisa, Italy
⁸ Dipartimento di Fisica, Università di Pisa, Largo B. Pontecorvo 3, 56127 Pisa, Italy
⁹ Department of Physics and Kavli Institute for Particle Astrophysics and Cosmology, Stanford University, 450 Serra Mall, Stanford, CA 94305, USA
¹⁰ Istituto Nazionale di Fisica Nucleare, Sezione di Napoli, Strada Comunale Cinthia, 80126 Napoli, Italy

Received: 2 March 2023
Accepted: 18 April 2023

Abstract

We present our study on the reconstruction of photoelectron tracks in gas pixel detectors used for astrophysical X-ray polarimetry. Our work aims to maximize the performance of convolutional neural networks (CNNs) to predict the impact point of incoming X-rays from the image of the photoelectron track. A very high precision in the reconstruction of the impact point position is achieved thanks to the introduction of an artificial sharpening process of the images. We find that providing the CNN-predicted impact point as input to the state-of-the-art analytic analysis improves the modulation factor (~1% at 3 keV and ~6% at 6 keV) and naturally mitigates a subtle effect appearing in polarization measurements of bright extended sources known as “polarization leakage”.

Key words: X-rays: general / instrumentation: polarimeters

© The Authors 2023

Open Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

This article is published in open access under the Subscribe to Open model.

1 Introduction

Linear X-ray polarization is of interest to a wide range of fields in physics: from crystal dynamical diffraction (Batterman & Cole 1964), to polarization radiography to supplement mammographic images (Maidment 2006), and to high-energy astrophysics. The latter application, in particular, is the focus of this work, but the outcome of this study can be easily extended to other applications based on the same detection technique.

The first astrophysical X-ray polarization measurements date back to the 1970s with the observation of the Crab Nebula (Novick et al. 1972; Weisskopf et al. 1978). These measurements were based on the Bragg reflection (Schnopper & Kalata 1969) of X-rays at 45 degrees on a crystal, exploiting the dependence of Bragg diffraction on the radiation polarization (only X-rays polarized perpendicularly to the plane of incidence are reflected).

X-ray polarimetry can be done significantly more efficiently by exploiting the strong dependence of the photoelectric effect on the polarization of the incident radiation. This technique was proposed in the early 2000s (Costa et al. 2001), and reached full maturity with the gas pixel detector (GPD; Bellazzini et al. 2006), which is now acquiring data on board the Imaging X-ray Polarimetry Explorer (IXPE), launched by NASA in 2021 (Weisskopf et al. 2022), and which will be installed on the future Chinese mission, enhanced X-ray Timing and Polarimetry (eXTP; Zhang et al. 2018). Such an instrument combines good imaging capabilities and unprecedented polarization sensitivity and has already opened a new path for the future of astrophysics. The typical energy range of these focusing X-ray polarimeters is between 1 and 10 keV, with a highly variable effective area within this range depending on the instrument focusing optics and detectors¹. Given the power-law nature of the spectra of virtually all astrophysical X-ray sources, it is crucial to have the best polarization sensitivity at the lower end of the energy band. For the reasons we illustrate in the next section, the polarization direction of the lower energy X-rays is also the most difficult to measure with the GPD. Indeed, all the present reconstruction methods, both the state-of-the-art analytic one developed by the IXPE collaboration and the recently developed machine learning (ML) techniques, show their biggest limitations at low energies. Moreover, all these reconstruction strategies suffer from a systematic effect called “polarization leakage” (Bucciantini et al. 2023), which we briefly discuss in Sect. 3.

In this paper, we propose a hybrid analytic-ML approach, in which we exploit an ML algorithm based on a convolutional neural network (CNN) to improve the performance of the state-of-the-art analytic algorithm, and to mitigate the polarization leakage effect. The working principle of the GPD and the data set used for the analysis are discussed in Sect. 2. In Sect. 3, the overall features of the reconstruction methods are described, with a focus on the relevance of the impact point parameter. In Sect. 4, we describe the structure of the adopted CNN and illustrate its training, optimization processes, and the results regarding the reconstruction of the impact point location. In Sect. 5, we present the polarization results we obtain with our hybrid algorithm. Section 6 summarizes the results and provides considerations for future applications and developments.

2 Instrument and data set

The complete description of how the GPD functions can be found in Bellazzini et al. (2006). Here we only summarize the general concept behind the use of GPDs to measure the X-ray polarization. As mentioned in the introduction, the instrument functioning is based on the photoelectric effect, in which an X-ray is absorbed in the gas gap of the GPD², and a photoelectron (PE) is ejected in the direction (θ, ϕ), namely the emission direction, which preferably lies on the oscillation plane of the electric field of the incoming X-ray. We note that θ is the angle between the incident X-ray direction and the PE emission direction, while ϕ is the azimuthal PE emission direction. The PE interacts with the gas atoms through ionizing collisions, losing energy at each collision³. Such interactions generate a pattern of ion-electron pairs called a “track” that marks the path followed by the PE before losing all its energy and being reabsorbed in the gas. The primary e⁻ charges are amplified and collected on a plane of hexagonal pixels in a honeycomb configuration. A track, therefore, is a pixellated image containing useful information about the PE, and, as a consequence, about the X-ray that generated that same PE. Two examples of a PE track image for two different X-ray energies are reported in Fig. 1.

Our data set consists of PE tracks generated through a Monte Carlo (MC) simulation software, named “ixpesim” (Di Lalla 2019), which relies on GEANT4 and uses a slightly customized version of the “Livermore Polarized” physics list. The data set consists of three different categories of simulations.

Firstly, we generated two million events from an unpolarized beam, with a flat energy spectrum in the range 1.0–9.0 keV. This energy range includes the energies of the highest sensitivity (mostly due to the higher effective area) of current and future experiments adopting this X-ray polarization technique, such as IXPE and eXTP. This portion of the data set was used to train and validate our CNN. The simulated gas pressure was set at 720 mbar for all the events we generated. Secondly, we generated test samples by simulating 100 000 events for each set at fixed energies: two sets (100% polarized and unpolarized) for 13 different monochromatic beams (between 2 keV and 8 keV with a 0.5 keV step). These events were used to evaluate and compare the performance of the algorithms for different energy values of the incident X-rays. Finally, we generated three sets of 500 000 events each, simulating three different unpolarized point sources with typical spectral shapes of astrophysical sources. Specifically, we simulated two power-law spectra (with −0.7 and −1.7 spectral indices) and one blackbody spectrum. This data set was used to evaluate and study the polarization leakage effect.

For each MC track, we know the true polarization angle, the true X-ray energy and impact point, and the true PE emission angle.

Fig. 1

Example of a photoelectron track generated by a 7 keV photon (top panel) and a 3.5 keV photon (bottom panel). The dashed lines and the black points represent the simulated PE emission direction and X-ray impact point, respectively. The simulated photoelectron path is reported as well. The color intensity of the pixels is proportional to the energy deposited in the gas by the photoelectron.

3 Reconstruction methods

The event reconstruction consists of the estimation of the properties of interest of the incoming X-ray photon through the PE track. In particular, the total collected charge is proportional to the energy of the X-ray, the starting point of the track gives the impact point (IP) of the photon, and the azimuthal angle ϕ of the PE emission direction (before it gets deviated by multiple interactions in the gas) carries the memory of the X-ray polarization direction. While the latter is the track parameter that directly provides the actual information about the polarization properties of the incident X-rays⁴, the photon impact point represents a key feature in the reconstruction process, both to avoid biases and to improve the general performance of the algorithms.

As shown in Fig. 1, the track morphology strongly depends on the energy of the absorbed X-ray. For low energy X-rays, the PE path is generally a few pixels long, and the track is essentially round. The lack of elongation, as well as the absence of skewness in the spatial distribution of the charge make the reconstruction of the properties of the tracks generated by low energy X-rays more challenging.

Fig. 2

Example of a 7.5 keV PE track with all the parameters reconstructed with moment analysis. The blue point and the dashed blue line are the barycenter and the direction of maximum elongation of the track, respectively. The green point and the green line are the reconstructed impact point and the reconstructed emission direction, respectively. The black point and the dashed black line are the true (MC) impact point and emission direction, respectively.

3.1 Analytic reconstruction with moment analysis

The state-of-the-art algorithm currently used by the IXPE collaboration (Bellazzini et al. 2003) is based on an analysis of the moments of the track image, and is hence called moment analysis. In short, this algorithm relies on the morphological properties of the track, especially its elongation, as well as on the energy deposited in each pixel to estimate both the impact point of the X-ray and the PE emission direction.

The details of the process are reported in Bellazzini et al. (2003) and summarized in Appendix A. Here we report only the final step of the analysis. Once the impact point (x_IP, y_IP) was reconstructed, the second moment of the charge distribution with respect to the location of the impact point was calculated as follows:

$$M_2(\phi) = \frac{\sum_i w_i \left[(x_i - x_{\mathrm{IP}})\cos\phi + (y_i - y_{\mathrm{IP}})\sin\phi\right]^2}{\sum_i w_i}, \qquad (1)$$

where (x_i, y_i) is the position of the i-th pixel, and w_i is the weight assigned based on the skewness of the track. The emission angle was obtained by evaluating the angle ϕ with respect to the x axis which maximized M_2(ϕ). In Fig. 2, a track image along with the parameters reconstructed with moment analysis is shown.
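As an illustration of this final step, the following Python sketch computes a weighted second moment of the pixel positions about a given impact point and finds the angle that maximizes it. The function names, the normalization by the sum of the weights, and the closed-form maximization are assumptions made for this example; they are not taken from the IXPE reconstruction software.

```python
import math

def second_moment(pixels, ip, phi):
    """Weighted second moment about the impact point `ip` along direction `phi`.

    `pixels` is a list of (x, y, w) tuples. Normalizing by the sum of the
    weights is an assumption of this sketch.
    """
    x0, y0 = ip
    num = sum(w * ((x - x0) * math.cos(phi) + (y - y0) * math.sin(phi)) ** 2
              for x, y, w in pixels)
    return num / sum(w for _, _, w in pixels)

def emission_angle(pixels, ip):
    """Angle in radians, in (-pi/2, pi/2], that maximizes `second_moment`."""
    x0, y0 = ip
    mxx = sum(w * (x - x0) ** 2 for x, y, w in pixels)
    myy = sum(w * (y - y0) ** 2 for x, y, w in pixels)
    mxy = sum(w * (x - x0) * (y - y0) for x, y, w in pixels)
    # The moment equals mxx*cos^2 + 2*mxy*sin*cos + myy*sin^2 (up to the
    # normalization), which is maximal at the angle returned below.
    return 0.5 * math.atan2(2.0 * mxy, mxx - myy)
```

The returned angle is defined modulo π, which is sufficient here because the polarization analysis depends only on 2ϕ. Since the moments are taken about the impact point, any shift in its estimate changes mxx, myy, and mxy, and therefore the angle.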

Fig. 3

Modulation factor as a function of energy for the standard moment analysis (black) and for the same analysis but substituting the predicted IP with the true one (red).

3.1.1 Impact point and modulation factor

The impact point of the incident photon is not directly linked to its polarization properties, but it plays an important role in the emission angle reconstruction with moment analysis. An incorrect estimation of the IP inevitably affects the subsequent reconstruction of the emission angle, as it is involved in Eq. (1).

In order to compare the performance of different algorithms, we used a figure of merit, the “modulation factor”. Denoted as µ, it represents the response of a polarimeter to a 100% polarized source as the normalized half-counting rate difference. It ranges between zero (no sensitivity) and one (maximum sensitivity). Its calculation through Stokes parameters is described in Kislat et al. (2015). The sensitivity of a polarimeter is affected both by the instrument limitations, and by the performance and the efficiency of the reconstruction algorithm. In this work we focus on the effects linked to the reconstruction algorithm.
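The Stokes-parameter calculation can be sketched in a few lines of Python. The example below assumes the per-event convention q = 2 cos(2ϕ), u = 2 sin(2ϕ) and returns the modulation amplitude and phase of a sample of emission angles; for a 100% polarized beam the amplitude is an estimate of µ. The function name is hypothetical, and uncertainties and event weights are deliberately left out.

```python
import math

def stokes_modulation(angles):
    """Return (modulation, phase) for a list of emission angles in radians.

    Per-event Stokes parameters q = 2 cos(2 phi) and u = 2 sin(2 phi) are
    averaged over the sample (normalized to I = number of events).
    """
    n = len(angles)
    q = sum(2.0 * math.cos(2.0 * a) for a in angles) / n
    u = sum(2.0 * math.sin(2.0 * a) for a in angles) / n
    return math.hypot(q, u), 0.5 * math.atan2(u, q)
```

For angles distributed as 1 + µ cos(2(ϕ − ϕ₀)), the averages converge to q = µ cos(2ϕ₀) and u = µ sin(2ϕ₀), so the function returns (µ, ϕ₀) up to statistical fluctuations.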

In Fig. 3, the modulation factor as a function of the energy is reported in black for the standard moment analysis. The worsening of µ with decreasing energy is due to the difficulties in reconstructing the correct emission angles of low energy photoelectron tracks. In the same figure, the red line represents the modulation factor obtained by substituting the predicted impact point with the MC (true) one in Eq. (1). This substitution results in a significant improvement in the performance of the algorithm. Clearly this procedure cannot be applied to experimental data, as we do not know the true IP location, but it highlights how improving the precision in the prediction of the IP position could lead to improved precision in the determination of the emission angle.

3.1.2 Impact point and polarization leakage

Recently, a systematic effect called polarization leakage has been found in IXPE measurements and discussed in Bucciantini et al. (2023). Due to the imprecise reconstruction of the impact point from PE tracks, some astrophysical sources exhibit an induced polarization pattern associated with intensity edges and gradients.

This effect can be particularly appreciated when analyzing unpolarized point sources, for which polarization leakage can cause an induced radial polarization. For this study we thus considered the three simulated unpolarized point sources described in Sect. 2. Since the pattern is, by definition, radially symmetric around the center of the point source, we measured the PE emission angle with respect to the reference axis defined by the radial direction of the impact point in order to observe a potential induced radial polarization. A schematic representation of the radial alignment is reported in Fig. 4.

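Firstly, for each event, we calculated the radial direction as follows:

$$\phi_{\mathrm{rad}} = \arctan\left(\frac{y_{\mathrm{rec}} - y_s}{x_{\mathrm{rec}} - x_s}\right), \qquad (2)$$

where (x_rec − x_s) and (y_rec − y_s) are the horizontal and vertical distances of the reconstructed IP from the simulated source position. In our case, (x_s, y_s) = (0, 0). We then aligned the Stokes parameters to the radial direction,

$$Q_{\mathrm{rad}} = Q_{\mathrm{rec}}^{(0)}\cos(2\phi_{\mathrm{rad}}) + U_{\mathrm{rec}}^{(0)}\sin(2\phi_{\mathrm{rad}}), \qquad (3)$$

$$U_{\mathrm{rad}} = -Q_{\mathrm{rec}}^{(0)}\sin(2\phi_{\mathrm{rad}}) + U_{\mathrm{rec}}^{(0)}\cos(2\phi_{\mathrm{rad}}), \qquad (4)$$

where Q_rec^(0) = 2 cos(2ϕ_rec^(0)), U_rec^(0) = 2 sin(2ϕ_rec^(0)), and ϕ_rec^(0) denotes the reconstructed emission angle direction. From the Stokes parameters, we calculated the polarization degree and angle as described in Kislat et al. (2015). The residual radial modulation obtained with the standard moment analysis for our sources is reported in Table 1.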
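The radial alignment described here amounts to rotating each event's Stokes parameters by twice the angle of the radial direction through the reconstructed impact point. The Python sketch below shows one way to write this; the function name, the sign convention of the rotation, and the use of `atan2` (which avoids a division by zero when the impact point lies directly above or below the source) are assumptions of the example.

```python
import math

def radial_stokes(phi_rec, ip, source=(0.0, 0.0)):
    """Per-event Stokes parameters measured from the radial direction.

    phi_rec: reconstructed emission angle in radians.
    ip: reconstructed impact point (x, y).
    source: source position (x, y), the origin by default.
    """
    phi_rad = math.atan2(ip[1] - source[1], ip[0] - source[0])
    q0 = 2.0 * math.cos(2.0 * phi_rec)
    u0 = 2.0 * math.sin(2.0 * phi_rec)
    c, s = math.cos(2.0 * phi_rad), math.sin(2.0 * phi_rad)
    # Rotating the reference axis by phi_rad rotates (q, u) by 2 * phi_rad.
    q_rad = q0 * c + u0 * s
    u_rad = -q0 * s + u0 * c
    return q_rad, u_rad
```

With this convention an event emitted along the radial direction gives q_rad = 2 and one emitted tangentially gives q_rad = −2, so a nonzero average of q_rad over an unpolarized sample signals an induced radial or tangential pattern.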

From Table 1 we notice that the moment analysis finds residual radial modulations up to 6.58 ± 0.23%, even if the simulated sources are unpolarized. The same effect can be noticed by binning the area surrounding the source and evaluating the quantities Q/I and U/I, with Q, U, and I being the Stokes parameters. In Fig. 5 we report the results for the blackbody spectrum source. We would again expect values compatible with zero, while we can notice a clear pattern that indicates a residual radial polarization.

Fig. 4

Representation of the radial alignment for a single event. The red and black dots are the source and the predicted IP positions, respectively. The black arrow is the predicted emission direction. The dashed red and black lines are the radial and standard reference axes, respectively, while the red and black angles are the radial and standard predicted emission angles, respectively.

3.2 CNN-based algorithms

Several groups have already shown the potential of ML to reconstruct the PE track and determine the incident photon properties. For example, Kitaguchi et al. (2019) trained a CNN to classify the PE tracks into polarization angle bins, and the same network also predicts the impact point position on the track. Moriakov et al. (2020) tested the use of CNNs through regression instead of classification, again to infer the impact point and the PE initial emission direction. Peirson et al. (2021) and Peirson & Romani (2021) also used CNNs to predict the energy, impact point, and polarization direction, evaluating prediction uncertainties for each event through a deep ensemble technique (Lakshminarayanan et al. 2017). In all of these works, the CNN architecture and the preprocessing phase of the images were focused on the reconstruction of the emission angle, with the IP prediction being a “secondary” product of the reconstruction.

Our work is intended to build and optimize a CNN to specifically reconstruct the IP location. By substituting the IP predicted by moment analysis with the one predicted by the CNN in Eq. (1), we aim both to improve the performance of the algorithm and to mitigate some systematic effects, such as the polarization leakage effect.

Table 1

Summary of the residual radial modulation calculated with standard moment analysis for the unpolarized point sources.

Fig. 5

Binned and interpolated calculation of Q/I and U/I, with Q, U, and I being the Stokes parameters, for the blackbody spectrum source (BB). The source is located at the center of the GPD, (x_s, y_s) = (0,0).

4 CNN for the impact point reconstruction

4.1 The CNN architecture

We built a CNN based on the DenseNet-121 (Huang et al. 2016) architecture, which we modified to incorporate hexagonal convolutions at minimum performance cost. We summarize the key features of our CNN below.

We started by implementing hexagonal convolution layers as a C++ extension for PyTorch, since the reference Python-only implementation of hexagonal convolutions in Steppa & Holch (2019) is substantially slower than the standard 2D Cartesian convolutions. Incorporating hexagonal convolutions allows the network to correctly capture the track structure and the spatial dependencies between pixels on the hexagonal pixel grid, which is particularly important in the initial layers of the CNN. Therefore, we employed hexagonal convolutions in the first convolution block, which consists of a stack of three hexagonal convolution layers with 64 filters in each layer, stride 1 and kernel radius 1. Each convolution layer here is followed by a batch normalization layer and a rectified linear unit (ReLU) activation function.
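To make the notion of a hexagonal convolution with kernel radius 1 concrete, the following pure-Python sketch applies a single-channel kernel (one center weight plus six neighbor weights) to pixels stored in axial hexagonal coordinates. It is only an illustration of the operation: the axial-coordinate dictionary, the function name, and the zero padding are assumptions, and it bears no relation to the C++ extension used in this work.

```python
# Axial-coordinate offsets of the six neighbors of a hexagonal pixel.
HEX_NEIGHBORS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]

def hex_conv_radius1(pixels, center_weight, neighbor_weights, bias=0.0):
    """Single-channel hexagonal convolution with kernel radius 1, stride 1.

    pixels: dict mapping axial coordinates (q, r) to pixel values; pixels
    missing from the dict are treated as zero (zero padding).
    neighbor_weights: six weights, ordered as HEX_NEIGHBORS.
    The output is evaluated only at the positions present in `pixels`.
    """
    out = {}
    for (q, r), value in pixels.items():
        acc = bias + center_weight * value
        for (dq, dr), w in zip(HEX_NEIGHBORS, neighbor_weights):
            acc += w * pixels.get((q + dq, r + dr), 0.0)
        out[(q, r)] = acc
    return out
```

Each output pixel sees exactly its six nearest neighbors, all at the same distance, which is the property a square 3 × 3 kernel applied to a re-sampled image would not preserve.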

For performance reasons, after the initial hexagonal convolution block, a transition was performed from the hexagonal grid to the Cartesian grid by applying a hexagonal convolution layer on a Cartesian subgrid of the hexagonal grid with stride 2. This “transition convolution” has kernel radius 1 and 64 filters as well, where the combination of stride 2 and kernel radius 1 implies that this transition convolution gathers image features from the entirety of the hexagonal grid. The transition convolution is followed by a batch normalization layer and ReLU activation. Subsequently, we switched to the standard DenseBlocks of the DenseNet-121 using Cartesian 2D convolutions only. A fully connected layer for impact point regression was applied to the resulting final CNN feature map with dimensions 6 × 6. Throughout the network, dropout layers with a dropout probability of 10% were used inside the DenseBlocks.

The network was built to predict the impact point position (x_true, y_true). The information is given in decimal pixel units (0 < x_true, y_true < 72), and we used the following loss function: (5)

We used the Adam optimizer (Kingma & Ba 2014), a commonly employed type of stochastic gradient descent, with a decaying learning rate starting from 10⁻⁴. A form of online hard example mining (OHEM; Shrivastava et al. 2016) was employed during training to improve the general performance of the network: in each batch, only the 50% of samples with the worst performance were taken into account during gradient computation.
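The OHEM selection can be illustrated with a short Python sketch that keeps only the hardest half of a batch when reducing the per-sample losses. The function name and the rounding of the kept count are assumptions of this example; in an actual training loop the per-sample losses would be tensors and the returned mean would be the quantity that is backpropagated.

```python
def ohem_mean_loss(losses, keep_fraction=0.5):
    """Mean of the largest `keep_fraction` of the per-sample losses."""
    if not losses:
        raise ValueError("empty batch")
    # Keep at least one sample so that the mean is always defined.
    n_keep = max(1, int(len(losses) * keep_fraction))
    hardest = sorted(losses, reverse=True)[:n_keep]
    return sum(hardest) / n_keep
```

Because the easy samples contribute no gradient, each update concentrates on the tracks whose impact point is currently predicted worst.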

4.2 Preprocessing, optimization, and training of the CNN

As described in Sect. 2, our data set consists of PE tracks generated through MC simulations. A preprocessing of the track images was needed in order to prepare the sample for the CNN.

The characteristic detector noise was realistically taken into account in the simulations and generated a background of nonzero value pixels in the frame of the track images. To suppress this noise, we set all pixels with values below a threshold of 20 analog to digital converter (ADC) counts to zero (which corresponds to ~45 electrons; Baldini et al. 2021). Additionally, pixels with values above the threshold but disconnected from the main pixel cluster of the track were also set to zero. This process facilitated and accelerated the CNN training by removing useless information. However, we also tested a CNN with noisy images and the performance was not significantly worse: the CNN on its own learns how to ignore the information carried by the random pixel noise. The pixel values were then rescaled to the range between zero and one.
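A minimal Python sketch of these three preprocessing steps (zero suppression, removal of disconnected pixels, rescaling) is given below. It assumes that the track is stored as a dictionary keyed by axial hexagonal coordinates, that the “main cluster” is the connected group of above-threshold pixels with the largest total charge, and that the rescaling divides by the maximum pixel value; none of these details is specified in the text, and the function name is hypothetical.

```python
HEX_NEIGHBORS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]

def clean_track(pixels, threshold=20):
    """Zero-suppress, keep the main cluster, and rescale to [0, 1].

    pixels: dict mapping axial hexagonal coordinates (q, r) to ADC counts.
    Returns a sparse dict; pixels that were set to zero are simply absent.
    """
    above = {p: v for p, v in pixels.items() if v >= threshold}
    best, seen = {}, set()
    for start in above:
        if start in seen:
            continue
        # Flood fill over hexagonal neighbors to collect one cluster.
        cluster, stack = {}, [start]
        seen.add(start)
        while stack:
            q, r = stack.pop()
            cluster[(q, r)] = above[(q, r)]
            for dq, dr in HEX_NEIGHBORS:
                nb = (q + dq, r + dr)
                if nb in above and nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        if sum(cluster.values()) > sum(best.values()):
            best = cluster
    peak = max(best.values(), default=1.0)
    return {p: v / peak for p, v in best.items()}
```

For example, a track with two adjacent pixels of 100 and 50 ADC counts, a 10-count neighbor, and an isolated 30-count pixel far away is reduced to the two adjacent pixels with values 1.0 and 0.5.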

In order to improve the precision of the network in identifying the correct IP position, we introduced a “sharpening” process of the images. For each pair of adjacent pixels, a new pixel was added halfway between them, and its value was linearly interpolated between the values of the two neighbor pixels. This process allowed us to preserve the hexagonal symmetry of the pixel matrix, while artificially increasing the sharpness of the images. In Fig. 6, an example of a track image before and after the sharpening process is shown. A detailed description of the process is reported in Appendix B.
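The sharpening step as described in this paragraph can be sketched as follows. In axial hexagonal coordinates the midpoint between two adjacent pixels falls on a hexagonal grid with half the pitch, so the sharpened image can be stored on a grid where the original pixel (q, r) moves to (2q, 2r). The coordinate convention, the treatment of missing pixels as zero, and the function name are assumptions of this example; the procedure in Appendix B may differ in its details.

```python
HEX_NEIGHBORS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]

def sharpen(pixels):
    """Insert a pixel halfway between every pair of adjacent hexagonal pixels.

    pixels: dict mapping axial coordinates (q, r) to values; pixels missing
    from the dict count as zero. Returns a dict on a grid with half the
    pitch, where the original pixel (q, r) sits at (2q, 2r).
    """
    fine = {}
    for (q, r), value in pixels.items():
        fine[(2 * q, 2 * r)] = value
        for dq, dr in HEX_NEIGHBORS:
            neighbor = pixels.get((q + dq, r + dr), 0.0)
            # Midpoint of the pair, linearly interpolated (i.e. averaged).
            fine[(2 * q + dq, 2 * r + dr)] = 0.5 * (value + neighbor)
    return fine
```

Each midpoint is reached from both pixels of its pair with the same averaged value, so the result does not depend on the iteration order, and the output grid is again hexagonal.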

Before giving the tracks as input to the network, images were reshaped into a 72 × 72 pixel frame. This size was chosen to reduce the training time as much as possible (the larger the images are, the larger the number of parameters in the network architecture, and the longer the time required to process them) without cropping a significant number of tracks at high energies. We evaluated the learning ability of the network with a validation data set, and we chose to train the network for 60 epochs, with the OHEM process being introduced at the 30th epoch.

Fig. 6

Example of a PE track before (top panel) and after (bottom panel) the sharpening process.

4.3 Impact point reconstruction results

After the training process, we tested our network with an independent data set, and we compared the results both with a network trained with unsharpened images and with the standard moment analysis. In order to lower the number of cropped tracks even further and to remove any possible bias introduced by the network, we first rotated each image by 120° and 240°, then predicted the impact point for each of the three images (the original and the two rotated ones), and finally rotated the predictions back. These rotations preserve the original hexagonal symmetry of the image. The reconstructed IP is the mean value of the three predictions made by the network.
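The rotate, predict, rotate back, and average procedure can be sketched in Python as below. The example stores the image in axial hexagonal coordinates, rotates it about the origin pixel, and uses a charge-weighted barycenter as a toy stand-in for the CNN; all function names, the coordinate convention, and the choice of rotation center are assumptions made for illustration.

```python
import math

def axial_to_xy(q, r):
    """Cartesian center of the hexagonal pixel (q, r), in units of the pitch."""
    return q + 0.5 * r, math.sqrt(3.0) / 2.0 * r

def rotate_hex(pixels, k):
    """Apply k successive rotations of -120 degrees about the origin pixel."""
    for _ in range(k % 3):
        pixels = {(r, -q - r): v for (q, r), v in pixels.items()}
    return pixels

def rotate_xy(x, y, angle):
    """Rotate the point (x, y) counterclockwise by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return c * x - s * y, s * x + c * y

def predict_with_rotations(pixels, predict):
    """Average the predictions on the image and its two rotated copies."""
    sx = sy = 0.0
    for k in range(3):
        x, y = predict(rotate_hex(pixels, k))
        # Undo the k rotations of -120 degrees applied to the image.
        x, y = rotate_xy(x, y, k * 2.0 * math.pi / 3.0)
        sx += x
        sy += y
    return sx / 3.0, sy / 3.0

def barycenter(pixels):
    """Toy stand-in for the CNN: the charge-weighted centroid."""
    total = sum(pixels.values())
    x = sum(v * axial_to_xy(q, r)[0] for (q, r), v in pixels.items()) / total
    y = sum(v * axial_to_xy(q, r)[1] for (q, r), v in pixels.items()) / total
    return x, y
```

A predictor with a constant offset in a fixed image direction shows why the averaging helps: after rotating back, the three copies of the offset point 120° apart and cancel exactly in the mean.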

To evaluate the performance of the models in the reconstruction of the IP position, we used three different figures of merit, shown as a function of energy in Fig. 7: the mean distance between the predicted IPs and the true ones (top panel), and the percentage of events for which the distance between the true IP position and the predicted one is smaller than one pixel (middle panel, % < 1 pixel) or two pixels (bottom panel, % < 2 pixel). The unit “pixel” we report in these figures of merit indicates the standard pixel dimension, not the sharpened one, for both the moment analysis and the two CNNs.
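These three figures of merit are straightforward to compute from matched lists of predicted and true impact points. The sketch below assumes positions expressed in units of the standard pixel size; the function and key names are hypothetical.

```python
import math

def ip_figures_of_merit(predicted, true):
    """Mean distance and percentage of events within one and two pixels.

    predicted, true: equal-length lists of (x, y) positions in pixel units.
    """
    dists = [math.dist(p, t) for p, t in zip(predicted, true)]
    n = len(dists)
    return {
        "mean_distance": sum(dists) / n,
        "pct_below_1": 100.0 * sum(d < 1.0 for d in dists) / n,
        "pct_below_2": 100.0 * sum(d < 2.0 for d in dists) / n,
    }
```

The two percentages complement the mean distance, which on its own can be dominated by a small number of badly reconstructed long tracks.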

Firstly, the sharpening process allowed the CNN to reach a higher precision in the prediction of the IP location, as the mean distance for the sharpened CNN is lower than the one for the standard CNN in almost the entire energy range, while the percentage of events for which the distance is lower than one or two pixels is higher. For example, at 3 keV (6 keV) the % < 1 pixel is up by ~11% (16%) and the % < 2 pixel is up by ~6% (2%). It should be noted that the standard CNN outperforms the sharpened one at very high energies, and this is probably due to a higher number of very long tracks that are cropped with the sharpening process. This could be solved by extending the frame dimension, but it would slow down the algorithm. As IXPE’s effective area is very low at very high energies, we have given priority to speeding up the algorithm. However, if the algorithm is to be used in other similar applications, this aspect might need further care.

Moreover, the performance of the sharpened CNN is significantly better compared to the moment analysis for the entire energy range and according to all three figures of merit. For example, at 3 keV (6 keV) the % < 1 pixel is up by ~3% (29%) and the % < 2 pixel is up by ~13% (12%). It is interesting to notice how the % < 2 pixel for the CNN is consistently higher than 80%, showing how the network is also accurate in the identification of the IP position when the tracks become longer. This feature is key in order to reduce the radial modulation induced by the polarization leakage effect.

In the top panel, the mean distance between the barycenter of the track and the true IP position is reported too. In the very low energy tracks, the true IP is very close to the barycenter, while the predictions of the moment analysis are likely to determine the IP position in a peripheral area of the track. Therefore, in the standard moment analysis, the prediction of the IP position could be manually substituted by the barycenter position to improve the precision at low energies. However, when handling data without MC information, it is not trivial to select the energy threshold from where the barycenter should be employed as the predicted IP, and it is not applied in the standard analysis. The CNN, on the other hand, automatically follows the trend of the barycenter for low energies.

We conducted an analysis of the three simulated unpolarized point sources to determine if the improved detection of the IP could result in a reduction of the point spread function (PSF) of the instrument. By considering only the contributions to the PSF from the GPD, we found that the new CNN-predicted IP resulted in a half-power diameter (HPD), that is to say the diameter within which half of the collected X-rays are enclosed, that was approximately 30% lower than that of the standard method (Weisskopf et al. 2022). However, the dominant contributors to the total PSF of IXPE are the mirror modules, whose HPD is at least approximately three times larger than that of the GPD. If we take all the contributors into account, the overall improvement in the HPD employing the CNN-predicted IP was limited to only 1–2%.
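A rough back-of-the-envelope check shows why a 30% gain on the GPD term shrinks to a percent-level gain overall. The sketch below assumes that the mirror and GPD contributions to the HPD add in quadrature, which is exact only for Gaussian profiles and is not a statement about how the values in this work were obtained; the function name and parameters are hypothetical.

```python
import math

def total_hpd_improvement(mirror_to_gpd_ratio, gpd_reduction=0.30):
    """Fractional reduction of the total HPD, assuming a quadrature sum.

    mirror_to_gpd_ratio: mirror HPD divided by the original GPD HPD.
    gpd_reduction: fractional reduction of the GPD HPD.
    """
    before = math.hypot(mirror_to_gpd_ratio, 1.0)
    after = math.hypot(mirror_to_gpd_ratio, 1.0 - gpd_reduction)
    return 1.0 - after / before
```

Under this assumption, a mirror HPD three times the GPD one gives a reduction of about 2.6%, and a ratio of four gives about 1.5%, the same order as the improvement quoted above.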

Fig. 7

Comparison of the IP position reconstruction between our sharpened-image CNN (red line), an unsharpened-image CNN (orange line), and the moment analysis (black line). In the top panel, the mean distance between the true IP position and the predicted one is reported. The dashed black line represents the distance between the true IP and the barycenter of the track. In the middle and bottom panel we reported the percentage of events for which the distance between the true and the predicted IP position is lower than one and two pixels, respectively.

Fig. 8

Modulation factor as a function of energy for the standard moment analysis (black) and for our hybrid algorithm (red).

5 Polarization results and discussion

The CNN-predicted IP was then passed to the moment analysis and introduced in Eq. (1) to predict the polarization of our testing samples and to evaluate the modulation factor as a function of energy. We report the results in Fig. 8.

Despite the substantially improved precision in the IP reconstruction at low energies (E < 3.5 keV), the enhancement in the modulation factor value is marginal, around 1%. At higher energies the improvement is more significant, up to ~6% at 6 keV.

We verified that the response to an unpolarized sample carries no residual modulation, by processing the three data sets simulating the unpolarized point sources (the two power-law spectra, PL1 and PL2, and the blackbody spectrum, BB). We calculated the residual modulation without aligning the predicted emission angle to the radial direction with the standard moment analysis and with the hybrid method. As expected, both algorithms found no residual modulation for all three sources, as reported in Fig. 9 and in Table 2.

We then analyzed the events of the unpolarized sources aligning the predicted emission angle to the radial direction to investigate the polarization leakage. We calculated the radial residual modulation for the standard moment analysis, for the same analysis but employing the barycenter as the best estimation of IP at low energies⁵, and for our hybrid method. The results are reported in Fig. 10 and in Table 3.

As already mentioned in Sect. 3.1.2, with the standard moment analysis, we measured residual modulations that are not compatible with zero, up to 6.58 ± 0.23%, even if the simulated point sources are unpolarized. We notice that employing the barycenter as the estimation of IP at low energies slightly mitigates this effect.

With our hybrid method, we can reduce the residual modulation by a factor of up to approximately two with respect to the moment analysis, although the reduction depends on the spectrum of the source considered. The effect is not completely eliminated, but the residual modulation is significantly reduced. One possible way to correct the polarization leakage effect involves an analytical PSF modeling (Bucciantini et al. 2023), which, however, leaves some residual errors. With our method, if the effect for the source is critical, corrections are still needed, but the residuals would be lower.

Fig. 9

Histograms of the reconstructed emission angle for the standard moment analysis (black) and for the hybrid method (red), for the three unpolarized point sources. Modulation factor values are reported in Table 2.

Table 2

Summary of the residual modulation for the unpolarized point sources with no radial direction projection.

Fig. 10

Residual radial modulation for the unpolarized point sources for three different methods: the standard moment analysis in black, the moment analysis that employs the barycenter as an IP prediction in blue, and our hybrid method in red.

Table 3

Summary of the residual radial modulation of the three unpolarized point sources.

6 Conclusions

We built a new hybrid ML-analytic model for the reconstruction of the PE emission angle in a GPD with hexagonal pixels. Our results show a promising improvement in the performance compared to the state-of-the-art analytic algorithm.

We implemented hexagonal convolutional layers as a C++ extension for PyTorch, which speeds up training and inference compared to a pure Python/PyTorch implementation. We introduced a sharpening process of the images, resulting in a significant improvement in the reconstruction accuracy of the PE impact point by the CNN. We then modified the standard moment analysis by using this CNN-predicted impact point in Eq. (1). Thanks to this change, our reconstruction algorithm performs better than the basic analytic reconstruction in recovering the modulation factor of 100% polarized monochromatic beams at all tested energies (with an improvement ranging from ~1% at low energies to ~6% at higher energies), while it correctly predicts null linear polarization for the unpolarized sources. Moreover, employing the CNN-predicted IP in the standard moment analysis mitigates the polarization leakage effect by up to a factor of approximately two compared to the standard moment analysis.

The results reported in this work do not include any weighting of the events. As already shown in other works (Peirson & Romani 2021; Di Marco et al. 2022), the performance could be further improved by applying an event-quality selection to remove or weight those events that convert outside the sensitive area of the detector. This topic, as well as the validation of the algorithm with real X-ray calibration data sets, is currently being investigated and will be the subject of a future study.

Acknowledgements

Portions of this research were conducted with high performance computing resources provided by Louisiana State University (http://www.hpc.lsu.edu). We acknowledge Federica Legger, Sara Vallero, and the INFN Computing Center of Turin for providing support and computational resources, as well as the HPC4AI Laboratory of the University of Torino.

Appendix A Moment analysis, a step-by-step description

As already mentioned in Sect. 3.1, the moment analysis aims to analytically reconstruct the track parameters by exploiting the morphological properties of the track. Here we provide the main steps of the analysis, while a more detailed description is reported in Bellazzini et al. (2003).

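First, the barycenter (x_b, y_b) of the charge distribution (blue dot in Fig. 2) was calculated as follows:

$$x_b = \frac{\sum_i q_i x_i}{\sum_i q_i}, \qquad y_b = \frac{\sum_i q_i y_i}{\sum_i q_i}, \tag{A.1}$$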

where q_i is the charge collected in the i-th pixel, and (x_i, y_i) is the position of the pixel center on the readout plane. Defining ϕ as the angle with respect to the x axis, the second moment of the charge distribution M₂(ϕ), referred to the barycenter (x_b, y_b) and to the direction defined by ϕ, was then calculated:

$$M_2(\phi) = \frac{\sum_i q_i \left[(x_i - x_b)\cos\phi + (y_i - y_b)\sin\phi\right]^2}{\sum_i q_i}. \tag{A.2}$$

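The maximum and minimum values of M₂(ϕ) correspond to the directions of maximum and minimum elongation of the track, and they were obtained by imposing dM₂(ϕ)/dϕ = 0 (the dashed blue line in Fig. 2 corresponds to the maximum elongation for the track). The third moment of the charge distribution, referred to the barycenter (x_b, y_b) and evaluated along the direction of maximum elongation ϕ_max,

$$M_3 = \frac{\sum_i q_i \left[(x_i - x_b)\cos\phi_{\max} + (y_i - y_b)\sin\phi_{\max}\right]^3}{\sum_i q_i}, \tag{A.3}$$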

allowed for the identification of the initial part of the track, as the PE lost more and more energy as it traveled through the gas, eventually forming the so-called “Bragg peak” when it was reabsorbed. Once the initial part of the track was selected, its barycenter was calculated. This new point was used to evaluate weights for each pixel in the whole track, as (A.4)

where d_b,i is the distance between each track pixel and the position of the barycenter of the initial part of the track, and d_s is a scale parameter. The impact point was then defined as follows (green dot in Fig. 2): (A.5)

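The second moment of the charge distribution was again calculated, this time with respect to the location of the impact point (x_IP, y_IP) and with the weighted pixels,

$$M_2^{w}(\phi) = \frac{\sum_i q_i w_i \left[(x_i - x_{\rm IP})\cos\phi + (y_i - y_{\rm IP})\sin\phi\right]^2}{\sum_i q_i w_i}, \tag{A.6}$$

where w_i are the pixel weights of Eq. (A.4), and the emission angle was obtained by evaluating the angle ϕ that maximizes this weighted second moment (dashed green line in Fig. 2).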
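The steps of this appendix can be sketched numerically as follows. This is a simplified illustration and not the reconstruction code used for the GPD: the exponential form of the weights, the selection of the initial part of the track as all pixels on the side of the barycenter indicated by the sign of the third moment, and the names `moment_analysis`, `major_axis_angle`, `d_s`, and `ip_override` are assumptions made for the example. The optional `ip_override` argument mimics the hybrid approach by replacing the analytic impact point with an externally supplied one in the last step only.

```python
import numpy as np

def major_axis_angle(x, y, w, x0, y0):
    """Angle (rad) maximizing the w-weighted second moment about (x0, y0)."""
    dx, dy = x - x0, y - y0
    sxx, syy, sxy = np.sum(w * dx * dx), np.sum(w * dy * dy), np.sum(w * dx * dy)
    # M2(phi) ~ sxx*cos^2(phi) + 2*sxy*sin(phi)*cos(phi) + syy*sin^2(phi),
    # which is maximal where tan(2*phi) = 2*sxy / (sxx - syy).
    return 0.5 * np.arctan2(2.0 * sxy, sxx - syy)

def moment_analysis(x, y, q, d_s=0.05, ip_override=None):
    x, y, q = (np.asarray(a, dtype=float) for a in (x, y, q))
    # Step 1: barycenter of the charge distribution.
    xb, yb = np.average(x, weights=q), np.average(y, weights=q)
    # Step 2: direction of maximum elongation about the barycenter.
    phi_max = major_axis_angle(x, y, q, xb, yb)
    # Step 3: third moment along that direction; its sign points to the
    # long, low-charge side of the track (assumed to be the initial part).
    t = (x - xb) * np.cos(phi_max) + (y - yb) * np.sin(phi_max)
    m3 = np.sum(q * t**3) / np.sum(q)
    start = t * (1.0 if m3 >= 0 else -1.0) > 0
    if not start.any():  # degenerate track: fall back to all pixels
        start = np.ones_like(start)
    xs = np.average(x[start], weights=q[start])
    ys = np.average(y[start], weights=q[start])
    # Step 4: weights decreasing with distance from the initial-part
    # barycenter (exponential form is an assumption of this sketch).
    w = np.exp(-np.hypot(x - xs, y - ys) / d_s)
    # Step 5: impact point as the weighted barycenter (assumed form).
    ip = (np.average(x, weights=q * w), np.average(y, weights=q * w))
    if ip_override is not None:
        ip = tuple(ip_override)
    # Step 6: emission angle from the weighted second moment about the IP.
    phi = major_axis_angle(x, y, q * w, ip[0], ip[1])
    return {"barycenter": (xb, yb), "impact_point": ip, "phi": phi}

# Toy straight track at 30 degrees, with charge growing toward the end.
k = np.arange(10)
x_pix = 0.05 * k * np.cos(np.pi / 6)
y_pix = 0.05 * k * np.sin(np.pi / 6)
result = moment_analysis(x_pix, y_pix, k + 1.0)
print(np.degrees(result["phi"]), result["impact_point"])
```

The returned angle is defined modulo π, which is sufficient for an estimator based on cos 2ϕ and sin 2ϕ.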

Appendix B Artificial sharpening: Detailed description and considerations

The moment analysis requires the impact point position in millimeters, while the output of our network is expressed in pixel units. It is therefore necessary to build a map from one coordinate system to the other, and to modify it accordingly when sharpening the images. The two coordinate systems are shown in Fig. B.1 (Baldini et al. 2021).

Knowing the number of rows and columns (N_row, N_col), and the horizontal and vertical distance between the centers of the hexagonal pixels (p_row, p_col), we can build a map from the position as a pixel number (i, j) to the position in millimeters (x, y): (B.1) (B.2)
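As a rough illustration of such a map, the sketch below converts pixel indices to millimeters for an offset hexagonal grid. The layout convention is an assumption of the example and may differ from the one adopted for the GPD: the origin is placed at the center of the matrix, rows are counted from the top, and odd rows are shifted by half a horizontal pitch. The function name `pixel_to_mm`, its parameter names, and the numerical values are hypothetical.

```python
import numpy as np

def pixel_to_mm(col, row, n_col, n_row, pitch_x, pitch_y=None):
    """Pixel-center position (mm) for an offset hexagonal grid.

    Assumed convention: origin at the matrix center, row index growing
    downward, odd rows shifted by +pitch_x / 2 along x.
    """
    col = np.asarray(col, dtype=float)
    row = np.asarray(row, dtype=float)
    if pitch_y is None:
        # Regular hexagons: row spacing is sqrt(3)/2 of the column spacing.
        pitch_y = pitch_x * np.sqrt(3.0) / 2.0
    shift = 0.5 * (row % 2)  # half-pitch offset of odd rows
    x = (col + shift - 0.5 * (n_col - 0.5)) * pitch_x
    y = (0.5 * (n_row - 1) - row) * pitch_y
    return x, y

# Illustrative 4 x 4 matrix with a 0.05 mm horizontal pitch.
x, y = pixel_to_mm([0, 0, 3], [0, 1, 1], n_col=4, n_row=4, pitch_x=0.05)
print(x, y)
```

With this convention the six nearest neighbors of a pixel are all at a distance equal to the horizontal pitch, which is the property that a square-grid representation of the image does not preserve without a correction.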

When sharpening the images, a possible solution to preserve the spatial information of the track is to consider different GPD parameters and a change in the coordinate system. Specifically: (B.3) (B.4)

The (x, y) position on the readout plane with respect to the new coordinates is the following: (B.5) (B.6)

Another factor that needs to be taken into account is that the image is passed to the network as a square image, and only in the first convolutional layer of the CNN is it interpreted as hexagonal. As we aim to achieve a high precision in the determination of the IP position, we need to correctly locate the IP on the square grid before giving the image as input to the network.

We consider six impact points in the same portion of an image in Fig. B.2, where each point represents the center of a pixel and each number is a hypothetical impact point position. The pixels of the standard images are shown in blue, while those added by the sharpening process are shown in red. A positional bias could occur for an IP whose j value is not an integer, that is, when the IP is not located on a horizontal axis defined by the pixel center positions (IPs 3, 4, 5, and 6 in the same figure). After the (x, y) → (i, j) conversion, if we do not correct the (i, j) values, the six IPs will be located on the square grid as in the upper panel of Fig. B.2. A correction of the IP positions proportional to their distance from the closest integer j value is needed to obtain a configuration as in the lower panel of Fig. B.2, which is more consistent with the positions of the IP on the hexagonal grid.

Fig. B.1

Scheme of the readout plane of the GPD. The (x, y) coordinate system has its origin in the center of the GPD, while the numbers on the borders of the pixel matrix refer to the (i, j) coordinate system. The horizontal and vertical distance between the center of the pixels is reported too. Image credits: Baldini et al. (2021)

Fig. B.2

Scheme of an image pixel structure before and after the conversion from a hexagonal grid to a square grid. Each point represents a pixel center (the standard image pixels are reported in blue, and the ones added with the sharpening process are reported in red), while each number is an impact point position. In the top panel, the position of the IPs on the square grid without corrections is reported, while in the bottom panel the position of the IPs on the square grid takes a fine-tuning correction into account.

References

Baldini, L., Barbanera, M., Bellazzini, R., et al. 2021, Astropart. Phys., 133, 102628
Batterman, B. W., & Cole, H. 1964, Rev. Mod. Phys., 36, 681
Bellazzini, R., Angelini, F., Baldini, L., et al. 2003, Proc. SPIE, 4843, 383
Bellazzini, R., Angelini, F., Baldini, L., et al. 2006, Nucl. Instrum. Methods Phys. Res. A, 560, 425
Bucciantini, N., Di Lalla, N., Romani, R. W. R., et al. 2023, A&A, 672, A66
Costa, E., Soffitta, P., Bellazzini, R., et al. 2001, Nature, 411, 662
Di Lalla, N. 2019, Ph.D. Thesis, University of Pisa, Italy
Di Marco, A., Costa, E., Muleri, F., et al. 2022, AJ, 163, 170
Huang, G., Liu, Z., van der Maaten, L., & Weinberger, K.Q. 2016, ArXiv e-prints [arXiv:1608.06993]
Kingma, D., & Ba, J. 2014, Proceedings of the 3rd International Conference on Learning Representations
Kislat, F., Clark, B., Beilicke, M., & Krawczynski, H. 2015, Astropart. Phys., 68, 45
Kitaguchi, T., Black, K., Enoto, T., et al. 2019, Nucl. Instrum. Methods Phys. Res. A, 942, 162389
Lakshminarayanan, B., Pritzel, A., & Blundell, C. 2017, in NIPS’17 (Red Hook, NY, USA: Curran Associates Inc.), 6405
Maidment, A. 2006, X-Ray Polarization Imaging (University of Pennsylvania)
Moriakov, N., Samudre, A., Negro, M., et al. 2020, ArXiv e-prints [arXiv:2005.08126]
Novick, R., Weisskopf, M., Berthelsdorf, R., Linke, R., & Wolff, R. 1972, ApJ, 174, L1
Peirson, A. L., & Romani, R. W. 2021, ApJ, 920, 40
Peirson, A. L., Romani, R. W., Marshall, H. L., Steiner, J. F., & Baldini, L. 2021, Nucl. Instrum. Methods Phys. Res. A, 986, 164740
Schnopper, H. W., & Kalata, K. 1969, AJ, 74, 854
Shrivastava, A., Gupta, A., & Girshick, R. 2016, ArXiv e-prints [arXiv:1604.03540]
Steppa, C., & Holch, T. L. 2019, SoftwareX, 9, 193
Weisskopf, M. C., Silver, E. H., Kestenbaum, H. L., Long, K. S., & Novick, R. 1978, ApJ, 220, L117
Weisskopf, M. C., Soffitta, P., Baldini, L., et al. 2022, J. Astron. Telesc. Instrum. Syst., 8, 026002
Zhang, S., Santangelo, A., Feroci, M., et al. 2018, Sci. China-Phys. Mech. Astron., 62, 029502

¹ Specifically, IXPE is optimized and calibrated to work between 2 and 8 keV.

² The gas mixture used for IXPE is dimethyl ether ((CH₃)₂O).

³ The energy loss is inversely proportional to the kinetic energy of the electron: dE/dx ∝ 1/β² ∝ 1/E_kin, where β is the velocity of the electron in units of c and E_kin is its kinetic energy.

⁴ The polarization degree and angle were statistically estimated on the basis of the PE azimuthal distribution.

⁵ As reported in the top panel of Fig. 7, the barycenter is closer to the true IP than the prediction of the moment analysis for energies lower than 3 keV.

