A Machine Learning Approach to Detecting Albedo Anomalies on the Lunar Surface Using Data-Driven Methods

Sofia Strukova
Department of Information and Communication Engineering
University of Murcia
Murcia, Spain 30100
strukovas@um.es
&Sergei Gleyzer
Department of Physics & Astronomy
University of Alabama
Tuscaloosa, AL 35401
sgleyzer@ua.edu
&Patrick Peplowski
Johns Hopkins University Applied Physics Laboratory
Laurel, MD 20723
Patrick.Peplowski@jhuapl.edu
&Jason P. Terry
Department of Physics & Astronomy
University of Georgia
Athens, GA 30602
jpterry@uga.edu
Abstract

This study introduces a data-driven approach using machine learning (ML) techniques to explore and predict albedo anomalies on the Moon’s surface. The research leverages diverse planetary datasets, including high-spatial-resolution albedo maps and element maps (LPFe, LPK, LPTh, LPTi) derived from laser and gamma-ray measurements. The primary objective is to identify relationships between chemical elements and albedo, thereby expanding our understanding of planetary surfaces and offering predictive capabilities for areas with incomplete datasets. To bridge the gap in resolution between the albedo and element maps, we employ Gaussian blurring techniques, including an innovative adaptive Gaussian blur. Our methodology culminates in the deployment of an Extreme Gradient Boosting Regression Model, optimized to predict full albedo based on elemental composition. Furthermore, we present an interactive analytical tool to visualize prediction errors, delineating their spatial and chemical characteristics. The findings not only pave the way for a more comprehensive understanding of the Moon’s surface but also provide a framework for similar studies on other celestial bodies.

1 Introduction

The study of planetary surfaces provides a window into the geologic history of celestial bodies, allowing scientists to piece together the evolutionary stories of planets and moons. The tools used to study these surfaces vary from telescopes on Earth to intricate instruments aboard spacecrafts. Through electromagnetic wavelengths, ranging from radar to gamma-ray, the mysteries of the surfaces are unveiled, each wavelength offering a different perspective. Notably, these wavelengths shed light on the surface’s chemistry, mineralogy, and past events. The Moon, for instance, has been extensively observed and its albedo has been correlated with the presence of specific elements like iron. However, when it comes to other celestial bodies, the relationships between observations may not be as well-established, and the datasets can be incomplete. Thus, a challenge of accurately predicting the properties of a planetary surface with limited information arises.

Planetary surface observations can be studied through a wide range of electromagnetic wavelengths (e.g., radar, infrared, optical, ultraviolet, x-ray, gamma-ray). Each wavelength provides unique information about the surface’s chemistry, mineralogy, and history. This includes reflected solar light, the magnitude of which is known as surface albedo. Yet, the information is not entirely independent. For example, the chemical element iron, which is mapped with x-rays and gamma rays, is highly related to optical albedo on the Moon. Knowing this, we can develop high-spatial-resolution predictive maps of iron based on optical data. On other planets, the relationships between the observations are less well-known, and some datasets are missing. We seek to study planetary surfaces by inputting maps of surfaces at all wavelengths available to discover the relationships between the measurements and to make predictions about chemistry that are not directly sampled by observations. This provides a way of studying the geologic history of a planet with existing data, which is valuable given the infrequent opportunities for new measurements by planetary spacecraft.

Many different elements contribute to varying extent to the albedo and the albedo can be related to the underlying elemental distribution. These relationships are generally well-known for the Moon, making it an ideal location to test new approaches to linking measurements at different wavelengths in anticipation of applying these techniques to other planetary surfaces where the relationship is less understood. Thus, we built a predictive model based on the compositional maps of the Moon derived from Lunar Prospector data to quantify the relationships between chemical element concentrations and the surface albedo. Our goals are to make predictions about albedo based on the elemental composition and uncover any possible anomalies that can have roots in the geologic history of a planet. This provides a way of studying the Moon with existing data, which is valuable given the infrequent opportunities for new measurements by planetary spacecraft. Moreover, the shape and the location of anomalies on the lunar surface can provide insights into processes that produced them and their subsequent evolution.

The remainder of this paper is structured as follows. In Section 2, we present our research methodology. Our findings are outlined in Section 3, while we extend the results, present interpretations, suggestions for future work, and limitations in Section 4. Finally, we draw our conclusions and future research directions in Section 5.

2 Methodology

We will pursue the methodology process presented in Figure 1.

Refer to caption
Figure 1: Overview of the methodology to predict the full albedo

In our research, we initially focus on the classification of two distinct datasets: albedo maps sourced from high spatial resolution laser measurements and element maps derived from gamma-ray measurements. The dataset consists of several maps of the Moon from the Lunar Prospector Gamma Ray and Neutron Spectrometer [1]. The element maps include gamma-ray derived concentrations for Fe (iron), K (potassium), Th (thorium), and Ti (titanium). The albedo map was derived from the Lunar Reconnaissance Orbiter LOLA instrument [2]. These datasets inherently possess disparate spatial resolutions, with the former measured in the range of tens of meters and the latter possessing an intrinsic spatial resolution of approximately 45 kilometers.

To address this disparity in resolutions, we conduct our experiment under two specific settings: one involving blurring techniques and one without. In the setting that incorporates blurring, we implement two types of Gaussian blur — standard and adaptive. The application-specific determination of the variance/covariance matrix parameters for the Gaussian filters represents a meticulous aspect of this phase.

Next, we turn our attention to the quantitative evaluation of these blurring techniques. For this purpose, we employ metrics such as the R2 score and the root-mean-square error. Based on this comprehensive assessment, we identify optimal parameters for our blurring process, with sigma set to 9 and kSize at (0,0).

Following the resolution harmonization and blurring evaluation, we proceed to train a machine learning (ML) regression model. For this task, we utilize a subset of the lunar surface as our training data, treating chemical elements as input variables and the albedo as the target variable. The Extreme Gradient Boosting Regression Model emerges as the best-performing ML model, configured with specific hyperparameters - namely, a learning objective of regression with squared loss, a learning rate of 0.1, 30 estimators, and a maximum depth of 5.

With this trained model in hand, we generate predictions for the full albedo map, covering both the original and blurred versions. In parallel, we develop an interactive analytical tool designed to enhance our research insights. This tool provides real-time contour mapping functionalities for regions where the predictive error exceeds a predefined threshold. It also allows for multiple user-configurable parameters, including error thresholds and display configurations.

Through this multifaceted research methodology, we aim to establish a robust and nuanced understanding of the variables affecting lunar albedo, thus setting the stage for future research and lunar explorations.

3 Results

3.1 Blurring

The need for the implementation of image blurring arises from the types of datasets used in this study. Specifically, the disparity in the spatial resolution between the two primary sources of data necessitated an approach to reconcile these differences and facilitate more effective comparative analysis.

Our study relies on two distinct categories of datasets. On the one hand, we utilize albedo maps derived from laser measurements featuring an exceptionally high spatial resolution on the order of tens of meters. Conversely, the elemental maps at our disposal, derived from gamma-ray measurements, offer an intrinsic spatial resolution of approximately 45 kilometers. The drastic incongruity in spatial resolutions between these datasets presents a challenge for direct comparison and analysis.

To address this issue, we conducted experiments in two settings: one incorporating image blurring and the other devoid of it (see Figure 2). This decision was geared toward achieving comparability between the disparate datasets. Furthermore, the technique facilitated the development of a “resolution-tolerant” ML model that is equipped to minimize potential artifacts stemming from variances in resolution — a significant stride toward the ultimate objective of identifying any concealed resolution artifacts.

Refer to caption
Figure 2: Original (in the left) and blurred (in the right) images of the Moon

Moreover, the spatial resolution of the albedo and element composition maps differ by many orders of magnitude. Thus, we created and implemented an adaptive Gaussian blur and compared it against the traditional standard blurring approach. In this setting, each pixel of the image is blurred proportionally, based on the 2D projection of the Moon. In particular, pixels that fall in an X km radius around the target pixel are used for the blurring. This way, the resolution of the albedo map can be adjusted to match the resolution of the element composition maps. The customization of the variance/covariance matrix for the Gaussian filter presented a challenge, given its dependence on specific application scenarios. Accordingly, next, we present the difference between the adaptive and the standard filter (see Figure 3) and additionally a map of how much sigma is applied to each pixel of the image (see Figure 4).

Refer to caption
Figure 3: The difference between the adaptive and the standard blurring
Refer to caption
Figure 4: Maps of the amount of sigma applied to each pixel of the image

In the interest of objectivity, quantitative metrics were employed to ascertain the efficacy of each blurring method. Specifically, in Figure 5 we present the R-squared (R2) score and root-mean-square error (RMSE) which were calculated and compared for each experimental setting.

Refer to caption
Figure 5: R2 score and root-mean-square error for each blurring configuration

Based on these empirical evaluations, optimal parameter values were identified. For subsequent experiments, we settled on a sigma value of 9 and a kernel size of (0,0), thus establishing a robust methodological foundation for future endeavors.

Finally, in order to validate the effectiveness of the chosen blurring technique, a regression model was trained on a subset of the surface to generate albedo maps of the Moon in both its original and blurred incarnations. In particular, the inputs of the model are the chemical elements and the albedo is the target. The final best performing model was the extreme gradient boosting regression model (the corresponding learning objective is regression with squared loss, learning_rate𝑙𝑒𝑎𝑟𝑛𝑖𝑛𝑔_𝑟𝑎𝑡𝑒learning\_rateitalic_l italic_e italic_a italic_r italic_n italic_i italic_n italic_g _ italic_r italic_a italic_t italic_e = 0.1, n_estimators𝑛_𝑒𝑠𝑡𝑖𝑚𝑎𝑡𝑜𝑟𝑠n\_estimatorsitalic_n _ italic_e italic_s italic_t italic_i italic_m italic_a italic_t italic_o italic_r italic_s = 30, max_depth𝑚𝑎𝑥_𝑑𝑒𝑝𝑡max\_depthitalic_m italic_a italic_x _ italic_d italic_e italic_p italic_t italic_h = 5). The results are presented in Figure 6. It serves not only as a testament to the efficacy of the blurring methodology but also as a cornerstone for ongoing and future research in this domain.

Refer to caption
Figure 6: Predicted images in the original and blurred Moon

3.2 Interactive Analyzer

In the concluding phase of our study, we designed an interactive analytical tool111The open source code can be found at https://github.com/ML4SCI/MLMapper/tree/main/Lunar_Prospector/ML_for_Planetary_Albedo_Sofia_Strukova in order to be able to meticulously identify and highlight discrepancies between the predicted and actual image data. This tool augments our methodological framework by providing a dynamic and nuanced visualization medium that enables real-time error assessment.

The interactive analyzer (see Figure 7) displays contours around regions of pixels whose predicted error value is higher than a given threshold. The contours are color-coded for interpretive convenience. Specifically, red contours signify a positive error, while black contours denote a negative error. The tool allows adjustments of the error threshold, a minimum area of the regions, as well as other parameters to configure how the image is displayed.

Refer to caption
Figure 7: An example of the interactive analyzer (error threshold = 0.04, minimal area = 40, correct image = original, training image = original)

3.2.1 User-Configurable Parameters

The tool offers an array of adjustable parameters to enhance the granularity of its analyses:

mask_threshold: Only consider those pixels whose prediction error is higher than this.

min_area: Discard contours whose area is smaller than this.

filled: Fills the contours.

correct_img: Image to be used as the correct one for computing the error.

training_img: Image to be used to train the ML model.

background_img: Image to be displayed in the background.

4 Discussion

4.1 Observations

Initial observations based on the results presented above suggest that certain contours can be readily accounted for by intrinsic regional properties. For instance, some areas - such as the Shackleton Crater-display unusually high reflectance relative to the adjacent south polar regions [3]. A considerable proportion of error contours appear to coalesce around young craters and their associated ray systems, hinting at a possible correlation between prediction error and geological age, a topic elaborated upon in the subsequent section of this paper.

Furthermore, we delve into a nuanced error analysis leveraging the capabilities of the interactive tool. Our preliminary observations suggest that prediction errors are not random but correlate with certain geological features and elemental compositions. For instance, errors appear more prominent in regions such as young craters or areas with specific elemental compositions that are not adequately captured in the existing lunar datasets.

This work can be further used as a base for predicting relationships between albedo and chemical composition on other airless bodies in order to obtain precise results and support ongoing and future missions. Mercury is a logical location at which to extend this work. The albedo of the Moon is generally much lower than that of Mercury, but the albedo of the lunar maria is similar to that of Mercury, despite Mercury’s low surface iron centrations [4]. This suggests a more significant role for space weathering on Mercury, which can be quantified via application of the ML techniques described here.

4.2 Optical Maturity

Optical maturity measures how long material has been on the lunar surface, exposed to the harsh space environment. Space exposure can, among other things, darken a material. Thus, two chemically-equivalent rocks with a different exposure age can have different albedo, meaning that bright material may simply be young. This offers the possibility of using the error map to identify younger optically immature materials independently from previously established techniques, as well as identifying young surface regions at a spatial scale at which existing techniques (e.g. crater counting) may not be reliable.

The Lunar Orbiter Laser Altimeter (LOLA) is an instrument on the payload of NASA’s Lunar Reconnaissance Orbiter spacecraft (LRO) [1], which was designed to measure the shape of the Moon by measuring precisely the range from the spacecraft to the lunar surface, and incorporating precision orbit determination of LRO, referencing surface ranges to the Moon’s center of mass. LOLA has five beams and operates at 28 Hz, with a nominal accuracy of 10 cm. Its primary objective is to produce a global geodetic grid for the Moon to which all other observations can be precisely referenced.

From an optical albedo map, an optical maturity map and an optical maturity map, we can notice that the dark mare is absent, and this fact was expected since here, the track composition is not taken into account. This breakdown between composition and optical maturity allows us to conclude that some of the represented regions indeed have low optical maturity, which we highlight in Figure 8.

Refer to caption
Figure 8: Potential light regions explained by low optical maturity

4.3 Limitations

We have faced several limitations during this work. Firstly, it is worth noting that specific elements not represented in the Lunar Prospector dataset — namely Iron (Fe), Titanium (Ti), Potassium (K), and Thorium (Th) — might contribute to the observed prediction errors. These elements collectively constitute no more than 20% of the total chemical inventory at any given location, thereby indicating potential areas for future research.

In addition to elemental representation, another important limitation to consider involves the inherent spatial resolution disparities between the albedo and element maps. While our methodology employs Gaussian blurring techniques to harmonize these differences, the process itself is an approximation and may introduce its own set of artifacts. Furthermore, the choice of the Extreme Gradient Boosting Regression Model, although empirically validated, may still encompass biases or limitations intrinsic to the model itself. For example, the model’s configuration and hyperparameters were optimized for a specific set of data and may not be universally applicable for varying lunar terrains or other celestial bodies. As such, both the resolution harmonization and model choice represent avenues warranting additional scrutiny in future research.

5 Conclusions

We analyzed the chemical composition and albedo of the Moon using ML methods with the goal of characterizing the relationship between albedo and surface chemistry. Because this relationship is well known on the Moon, this work served to verify the validity of our ML-based approach. By making the albedo maps of comparable resolution to the element maps and removing the spatial resolution difference, we made the data more comparable for training and prediction. This was helpful to minimize potential artifacts that arise from the resolution effects. Our resulting predicted albedo map, and difference from the measured abledo, revealed locations where chemistry is not the dominant effect, for example in optically immature regions.

As a part of an additional experiment, we made a prediction including the surrounding pixels. We achieved slightly better results, but more exploration is needed on the methods that could be used in order to implement this feature and to prove that this setting would benefit the overall results. Moreover, we see potential in exploring the opportunities provided by Toolkit for Multivariate Data Analysis with ROOT and its features applied to high-energy particle physics.

6 Acknowledgments

Part of this work was developed during the Google Summer of Code (GSoC) 2021 program as part of the ML4SCI organization.

References

  • [1] T. H. Prettyman, J. J. Hagerty, R. C. Elphic, W. C. Feldman, D. J. Lawrence, G. W. McKinney, and D. T. Vaniman. Elemental composition of the lunar surface: Analysis of gamma ray spectroscopy data from lunar prospector. Journal of Geophysical Research: Planets, 111(E12):n/a–n/a, December 2006.
  • [2] David E. Smith, Maria T. Zuber, Glenn B. Jackson, John F. Cavanaugh, Gregory A. Neumann, Haris Riris, Xiaoli Sun, Ronald S. Zellar, Craig Coltharp, Joseph Connelly, Richard B. Katz, Igor Kleyner, Peter Liiva, Adam Matuszeski, Erwan M. Mazarico, Jan F. McGarry, Anne-Marie Novo-Gradac, Melanie N. Ott, Carlton Peters, Luis A. Ramos-Izquierdo, Lawrence Ramsey, David D. Rowlands, Stephen Schmidt, V. Stanley Scott, George B. Shaw, James C. Smith, Joseph-Paul Swinski, Mark H. Torrence, Glenn Unger, Anthony W. Yu, and Thomas W. Zagwodzki. The lunar orbiter laser altimeter investigation on the lunar reconnaissance orbiter mission. Space Science Reviews, 150(1-4):209–241, May 2009.
  • [3] Maria T. Zuber, James W. Head, David E. Smith, Gregory A. Neumann, Erwan Mazarico, Mark H. Torrence, Oded Aharonson, Alexander R. Tye, Caleb I. Fassett, Margaret A. Rosenburg, and H. Jay Melosh. Constraints on the volatile distribution within shackleton crater at the lunar south pole. Nature, 486(7403):378–381, June 2012.
  • [4] Shoshana Z. Weider, Larry R. Nittler, Richard D. Starr, Timothy J. McCoy, and Sean C. Solomon. Variations in the abundance of iron on mercury’s surface from messenger x-ray spectrometer observations. Icarus, 235:170–186, 2014.