Data Descriptor

Multi-Scale Earthquake Damaged Building Feature Set

1 Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China
2 University of Chinese Academy of Sciences, Beijing 100049, China
* Author to whom correspondence should be addressed.
Submission received: 14 April 2024 / Revised: 19 June 2024 / Accepted: 24 June 2024 / Published: 28 June 2024
(This article belongs to the Section Spatial Data Science and Digital Earth)

Abstract:
Earthquake disasters are marked by their unpredictability and potential for extreme destructiveness. Accurate information on building damage, captured in post-earthquake remote sensing images, is critical for an effective post-disaster emergency response. The foundational features within these images are essential for the accurate extraction of building damage data following seismic events. Presently, the availability of publicly accessible datasets tailored specifically to earthquake-damaged buildings is limited, and existing collections of post-earthquake building damage characteristics are insufficient. To address this gap and foster research advancement in this domain, this paper introduces a new, large-scale, publicly available dataset named the Multi-Scale Earthquake Damaged Building Feature Set (MEDBFS). This dataset comprises image data sourced from five significant global earthquakes and captured by various optical remote sensing satellites, featuring diverse scale characteristics and multiple spatial resolutions. It includes over 7000 images of buildings pre- and post-disaster, each subjected to stringent quality control and expert validation. The images are categorized into three primary groups: intact/slightly damaged, severely damaged, and completely collapsed. This paper develops a comprehensive feature set encompassing five dimensions: spectral, texture, edge detection, building index, and temporal sequencing, resulting in 16 distinct classes of feature images. This dataset is poised to significantly enhance the capabilities for data-driven identification and analysis of earthquake-induced building damage, thereby supporting the advancement of scientific and technological efforts for emergency earthquake response.
Dataset: https://github.com/ziwen-hash/MEDBFS (accessed on 22 April 2024).
Dataset License: CC-BY-NC.

1. Introduction

In recent years, natural disasters have increasingly inflicted severe losses of life and property globally [1]. Earthquakes, as one of the most unpredictable natural disasters, are particularly challenging to mitigate, with their widespread impacts and substantial destruction in urban areas [2,3,4]. For instance, the 2010 Yushu earthquake in Qinghai, China, resulted in over 2000 fatalities and injured more than 100,000 individuals. Notably, rescue efforts successfully extracted over 6000 survivors from the rubble [5]. The structural failure of buildings during seismic events frequently accounts for the majority of casualties and significant economic losses. In the context of Turkey’s February 2023 earthquakes, reports from the Turkish Enterprise and Business Federation indicate that over 84,000 buildings were either severely damaged or collapsed, contributing to estimated economic losses of approximately USD 70.8 billion and a reduction in national income by USD 10.4 billion [6]. The advancement of high-resolution remote sensing satellites has significantly improved the capability for ground information extraction. Consequently, satellite technology has become essential for large-scale monitoring and post-disaster assessments [7,8,9,10]. The strategic application of remote sensing techniques to evaluate urban damage after earthquakes is thus emerging as a critical area of focus in disaster remote sensing research [11,12,13].
The advent of deep learning techniques has bolstered the theoretical framework for intelligent remote sensing interpretation, enabling precise automatic recognition through the training of large and diverse datasets [14,15,16]. To advance research on intelligent interpretation algorithms in remote sensing for assessing building damage following disasters, Gupta et al. [17] introduced the xBD dataset, which spans multiple disaster types and categorizes building damage into four distinct levels, encompassing 384 images of earthquake-affected buildings. Zhe Dong applied deep learning to identify structures damaged by the 6.4-magnitude earthquake that struck Cangshan West Town, Yangbi County, Dali Prefecture, Yunnan Province, China on 21 May 2021, utilizing high-definition images from Google Earth and drones to compile a dataset of damaged buildings [18]. Furthermore, Yao Sun and colleagues have developed the SAR-Optical Dataset for Rapid Damage Building Detection, focusing on radar imagery. These datasets significantly enhance the research on algorithms for extracting data on buildings damaged by earthquakes. Nonetheless, the reliance on single satellite sources and specific resolutions limits the generalizability of the trained models, posing challenges to their broader application [19]. Additionally, most earthquake emergency datasets suffer from limited coverage areas and data volume, further complicating model training [20]. The variety of resolutions and satellite sources, along with divergent architectural styles and topography in different earthquake regions, present substantial challenges for training models effectively.
The extraction of features from damaged buildings represents a critical area of focus in post-disaster damage assessment. Historically, numerous scholars have successfully employed threshold techniques or machine learning approaches based on damaged building features to recognize and extract information pertaining to post-earthquake structural damage [21,22,23,24]. Fundamental image feature algorithms offer significant advantages, including robust interpretability and well-defined mathematical principles. Additionally, shallow image features can effectively augment deep learning algorithms by providing auxiliary recognition capabilities [25].
In response to the outlined challenges, this paper introduces the Multi-scale Earthquake Damaged Building Feature Set (MEDBFS), a comprehensive large-scale dataset. This feature set amalgamates various satellite remote sensing images and incorporates elements of the xBD dataset as its primary data sources, showcasing a diverse array of scales and resolutions from images collected across multiple significant earthquake events. Based on the visual characteristics of the acquired images, damaged buildings are systematically classified into three categories: intact/slightly damaged, severely damaged, and completely collapsed. The dataset constructs six distinct types of feature images, including spectral, texture, edge, building index, shadow, and temporal sequences, encompassing a total of 7062 images that depict buildings before and after sustaining damage.

2. Overview of Feature Sets

2.1. Data Source Overview

Data acquisition for this study encompasses five major earthquake-affected regions: Yushu, China, in 2010; Puebla and its environs, Mexico, in 2018; Kahramanmaraş and Gaziantep, Türkiye, in 2023; Marrakech and surrounding areas, Morocco, in 2023; and Herat and surrounding areas, Afghanistan, in 2023. Detailed specifics regarding each earthquake are presented in Table 1 [5,6,26,27,28]. The optical remote sensing images utilized were acquired promptly post-disaster, ensuring relevance for emergency response analysis. The feature set developed in this study leverages a variety of remote sensing satellite imagery sources, including QuickBird, Jilin-1, and WorldView, to enhance the robustness and comprehensiveness of the data collected.
The QuickBird satellites, launched by DigitalGlobe, Longmont, CO, USA, are advanced high-resolution optical remote sensing satellites that deliver multispectral images at a resolution of 2.44 m and panchromatic images at 0.61 m. The fusion of these images produces multispectral remote sensing images with an enhanced resolution of 0.61 m. The GF-2 satellites, developed independently by China Aerospace Science and Technology Corporation, Beijing, China, are high-resolution optical remote sensing satellites that provide images at resolutions as fine as 0.8 m at nadir. Additionally, the Jilin-1 satellites, independently developed by Chang Guang Satellite Technology Co., Ltd., Changchun, China, comprise a commercial remote sensing satellite constellation capable of revisiting any point on Earth between 35 and 37 times per day, which is instrumental for timely post-disaster data acquisition. The optical data from Jilin-1 utilized in this study have a resolution of 0.75 m. Furthermore, the WorldView satellites, operated by MAXAR, Westminster, CO, USA, are high-resolution commercial remote sensing satellites that supply premium optical remote sensing images with a resolution of 0.3 m, pivotal for disaster response applications. The xBD dataset represents the first large-scale endeavor for post-disaster building damage assessment, encompassing imagery from 19 distinct disaster events [17].
In this research, multispectral and panchromatic imagery from the 2010 China earthquake were sourced from the QuickBird satellite. Optical images for the 2023 Turkey and Afghanistan earthquakes were procured from the Jilin-1 satellites, while imagery of the 2023 Morocco earthquake was captured via the WorldView satellite. Additionally, image data from the 2018 Mexico earthquake, sourced from the xBD dataset which was provided by WorldView satellites, served as the primary data for this study. The distribution of the samples collected for each earthquake event is depicted in Figure 1, with comprehensive details regarding the data sources provided in Table 2.

2.2. Feature Set Overview

To facilitate the utilization of diverse image features for the classification of earthquake-damaged buildings, this paper introduces a feature set divided into three components: original pre- and post-disaster images, classification labels for damaged buildings, and feature maps of building damage. The original images, which are optical satellite remote sensing images, have been collected from five major earthquake-stricken regions, totaling 7062 images, each with dimensions of 512 × 512 pixels. The classification labels for the damaged buildings reflect multi-class outcomes based on a meticulous analysis by domain experts. This analysis involves a comparative assessment of the imagery before and after disasters, applying specific interpretive criteria to determine the extent and type of damage inflicted on the structures. Furthermore, this study has developed specific classification standards based on existing remote sensing interpretation protocols and the detectability of the samples at the utilized resolution.
The damaged building feature maps within this dataset encompass five categories of features: spectral, texture, edge, building index, and temporal, which are further divided into 16 distinct feature values. The feature set is organized into five compressed packages, one per earthquake-affected region. Each compressed archive comprises three folders—Image, Label, and Feature—and one explanatory text file. Within the Image folder, pre-disaster and post-disaster images are stored with the markers ‘tiffpre’ and ‘tiffpost’, respectively. The Label folder contains data where ‘0’ indicates no obvious damage or a non-building area, ‘1’ denotes severe damage, and ‘2’ signifies complete collapse. The Feature folder houses a total of 16 feature files. All images are in TIFF format, and each package is accompanied by a text file that provides detailed descriptions of its contents. The organizational structure of the feature set, exemplified by the China earthquake, is depicted in Figure 2.
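As an illustration of how the archive layout described above might be traversed, the following Python sketch pairs pre-/post-disaster images with their label files. The `index_archive` helper is hypothetical, and any file naming beyond the ‘tiffpre’/‘tiffpost’ markers stated in the text is an assumption, not part of the released tooling.

```python
from pathlib import Path

# Label semantics as described in the text.
CLASS_NAMES = {
    0: "no obvious damage / non-building",
    1: "severe damage",
    2: "complete collapse",
}

def index_archive(root):
    """Pair pre-/post-disaster TIFFs and labels from one extracted archive.
    Assumes matching sort order across the Image and Label folders."""
    root = Path(root)
    pre = sorted(p for p in (root / "Image").iterdir() if "tiffpre" in p.name)
    post = sorted(p for p in (root / "Image").iterdir() if "tiffpost" in p.name)
    labels = sorted((root / "Label").iterdir())
    assert len(pre) == len(post) == len(labels)
    return list(zip(pre, post, labels))
```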

3. Construction of Damaged Building Sample Set

3.1. Data Preprocessing

Raw data collected in the aftermath of disaster emergencies typically necessitate preprocessing to ensure their applicability and validity. To this end, this study undertook specific preprocessing measures on the original data to validate their integrity and utility for subsequent analyses.
To address the variability in imaging quality and cloud cover that significantly affect the identification of damaged buildings, this paper conducts preliminary screening of the original pre- and post-disaster image data sourced from diverse origins. This process ensures that only high-quality images are retained for further analysis. For images available in both multispectral and panchromatic bands, a fusion technique is employed to enhance the resolution of multispectral images, thereby improving their recognizability. Significant discrepancies exist in the grayscale values and absolute radiance between original image datasets. Standardization issues arise when comparing images produced by different sensors at varying times. To address this, the study performs radiometric calibration on the collected remote sensing images, converting visible light reflectance data into standardized units using specific formulas, which enhance the data’s readability and credibility. Furthermore, although atmospheric effects on visible light are generally minimal, aerosols and water vapor can still impact light transmission. Accordingly, this paper implements atmospheric correction techniques on optical remote sensing data to refine image quality.
Original datasets may exhibit geographical inaccuracies that lead to coordinate misalignment between pre- and post-disaster remote sensing images. Orthorectification of visible light images substantially enhances geographical positioning accuracy, mitigates terrain-induced distortions, and facilitates the subsequent registration of dual-time images. Despite these improvements, discrepancies in detail between dual-time images persist following orthorectification. Consequently, it is imperative to conduct geographic matching on the pre- and post-disaster datasets to ensure positional consistency across the images.
In instances where the resolutions of dual-time images differ, this study implements raster downsampling on high-resolution remote sensing images to achieve uniform resolution, thereby ensuring pixel-level correspondence between the images. Concurrently, to optimize inputs for subsequent deep learning analyses, this paper systematically crops images to a uniform size of 512 × 512 pixels. During this cropping process, sample images lacking damaged building representations are excluded, thus maintaining a balance between intact and damaged building datasets. The detailed workflow of these procedures is depicted in Figure 3.
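The cropping-and-filtering step can be sketched as follows. This is an illustrative reconstruction, not the authors' actual preprocessing code; keeping only tiles that contain at least one damaged-building pixel (label 1 or 2) is a simplifying assumption about how the balance between intact and damaged samples is maintained.

```python
import numpy as np

def tile_pairs(post_img, label, size=512):
    """Cut a co-registered post-disaster image and its label map into
    size x size tiles, discarding tiles with no damaged-building pixels."""
    tiles = []
    h, w = label.shape
    for y in range(0, h - size + 1, size):
        for x in range(0, w - size + 1, size):
            lab = label[y:y + size, x:x + size]
            if (lab > 0).any():          # keep only tiles containing damage
                tiles.append((post_img[y:y + size, x:x + size], lab))
    return tiles
```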

3.2. Ground Truth Annotation

To facilitate the application of the feature set in deep learning contexts, this paper integrates actual image interpretation features with established international standards and annotates damaged buildings with ground truth classifications. The European Macroseismic Scale (EMS98), initially proposed by European scholars in 1998, categorizes earthquake-induced building damage into five levels: slight, moderate, severe, very severe, and destruction [29]. However, the EMS98 classification, originally based on ground surveys, may not adequately distinguish damage levels in remote sensing images. Consequently, Gupta et al. [17] introduced the xBD dataset, which revises the classification of building damage into four distinct levels: no damage, slight damage, severe damage, and destroyed, better suiting the nuances of remote sensing analysis. See Table 3 for detailed classification rules.
This study integrates the building damage classification from EMS-98 with the visual interpretation criteria for remote sensing assessment of building damage as proposed by Gupta et al. Given the limited discernibility of low-resolution satellite images, such as those from QuickBird, for detecting slight damage, and considering the relatively lower risk posed by slight damage to human life and property safety, this paper proposes a revised classification of post-earthquake building damage into three categories: intact/slight damage, severe damage, and complete collapse. This classification leverages spectral, texture, geometric, and shadow features extracted, comparing pre- and post-disaster images. The specifications of these features are detailed in Table 4.
During the annotation process, rigorous quality control measures are implemented. Initial annotations are performed by laboratory personnel, followed by a secondary review conducted by professional technical staff. Annotations are meticulously based on comparisons with pre-disaster images, and critical inspections of building damage are undertaken in areas reported to be severely affected post-disaster to ensure the authenticity and validity of the annotations. The classification criteria for collapsed buildings are delineated in Table 3. Figure 4 provides an example of the ground truth annotation process.
To validate the efficacy of our data annotation protocol, we trained the FTN network on the dataset [30]. The FTN network utilizes a Siamese structure on the encoding side to augment dual-phase feature extraction, with the Swin Transformer serving as the foundational backbone of this structure. The feature fusion stage incorporates both feature summation and feature difference to amplify multi-level visual features. On the decoding side, the Progressive Attention Module (PAM) is incorporated to construct a pyramid structure, enhancing the representational capacity of the network. Moreover, deep supervision is applied to optimize the training outcomes. The FTN network has demonstrated robust performance across various change detection datasets. However, during our experiments, we observed that convolution in the encoding layer yielded superior results compared to the Swin Transformer on our dataset. Consequently, we substituted the Swin Transformer with ConvNeXt to enhance training efficacy [31].
The division ratio of the training set to the validation set is established at 8:2. To ensure an equitable distribution, the 20% of samples forming the validation set were randomly selected from diverse geographical regions.
In our experiments, we employed two backbones, ConvNeXt-base and ConvNeXt-small. The training curves for our dataset are depicted in Figure 5, while the training and validation F1-scores are presented in Figure 6. Cross-entropy loss was selected as the training loss function. In the formula below, L represents the value of the loss function, N denotes the number of samples, y_i is the true label of the i-th sample (for binary classification, typically 0 or 1), and p_i is the probability predicted by the model that the i-th sample belongs to the positive class:
L = -\frac{1}{N}\sum_{i=1}^{N}\left[ y_i \log(p_i) + (1 - y_i)\log(1 - p_i) \right]
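A minimal NumPy implementation of this loss is shown below; the clipping constant `eps`, added for numerical stability, is our addition and not from the paper.

```python
import numpy as np

def binary_cross_entropy(y, p, eps=1e-12):
    """L = -(1/N) * sum[y*log(p) + (1-y)*log(1-p)], with clipping so that
    log() never receives 0 or 1 exactly."""
    y = np.asarray(y, dtype=float)
    p = np.clip(np.asarray(p, dtype=float), eps, 1 - eps)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))
```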
We utilized the F1 Score and Mean Intersection over Union (MIoU) derived from multiple categories to gauge the accuracy of the training outcomes. In machine learning, prediction outcomes can be evaluated with a confusion matrix. As shown in Table 5, the confusion matrix depicts the performance of the classification model. In this matrix, True Positive (TP) denotes the number of samples that are positive in both the actual and predicted values, False Positive (FP) represents the number of samples falsely predicted as positive while being negative, False Negative (FN) indicates the number of samples falsely predicted as negative while being positive, and True Negative (TN) represents the number of samples that are negative in both the actual and predicted values. Based on this matrix, scholars have developed metrics such as Precision and Recall. Considering that our dataset contains three categories, we treat the task as three binary classifications and compute the confusion matrix for each category:
\mathrm{Precision} = \frac{TP}{TP + FP}

\mathrm{Recall} = \frac{TP}{TP + FN}
The F1 Score is the harmonic mean of precision and recall, and the specific calculation formula is as follows:
F1\ \mathrm{Score} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}
Mean Intersection over Union (MIoU) is a statistical metric used extensively in the evaluation of object detection and segmentation models, particularly in the fields of computer vision and image processing. The formula is as follows (in the formula, n represents the number of categories):
\mathrm{MIoU} = \frac{1}{n}\sum_{i=1}^{n}\frac{TP_i}{TP_i + FP_i + FN_i}
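The per-class confusion counts and both summary metrics can be computed as in this sketch, which treats the three-class task as three one-vs-rest binary problems as described above; macro-averaging the F1 Score across classes is our assumption.

```python
import numpy as np

def per_class_metrics(y_true, y_pred, n_classes=3):
    """Return (macro F1, MIoU) from flat label arrays.
    TP/FP/FN are accumulated one-vs-rest for each class."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    f1s, ious = [], []
    for c in range(n_classes):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
        ious.append(tp / (tp + fp + fn) if tp + fp + fn else 0.0)
    return float(np.mean(f1s)), float(np.mean(ious))
```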
Both the F1 Score and MIoU are indicative of the overall category discrimination accuracy. The detailed training results are tabulated in Table 6.
Experimental results indicate that our annotated dataset achieves a peak validation F1-score of 74% on the FTN network, demonstrating a robust performance in the recognition of collapsed buildings. Furthermore, as illustrated in Figure 7, the model effectively discriminates between various conditions of building damage.

4. Construction of Damaged Building Feature Set

This paper focuses on the extraction of shallow features from diverse image types based on fundamental optical remote sensing data, thereby furnishing shallow feature datasets essential for machine learning and deep learning training. The feature set delineated herein encompasses five categories: spectral, texture, edge, index, and temporal features. Spectral features are categorized into red, green, and blue components. Texture features encompass contrast, energy, correlation, entropy, and angular second-moment gray-level texture features calculated via the Gray-Level Co-occurrence Matrix (GLCM) [32], as well as local binary texture features capturing grayscale variation, derived using the Local Binary Pattern (LBP) technique [33]. Edge features include those derived from the Prewitt and Laplacian operators [34,35]. Building index features utilize the Morphological Building Index (MBI) for feature representation. Shadow features are quantified using shadow index calculations. Temporal features involve 3D texture features computed through a 3D Gray-Level Co-occurrence Matrix [36]. The comprehensive composition of the feature set is depicted in Figure 8.

4.1. Spectral Characteristics

Spectral features, fundamental to the characterization of images, demonstrate varied patterns of spectral absorption, reflection, and radiation across different terrestrial objects. The specific material composition of buildings dictates their unique spectral reflective properties. Post-earthquake, collapsed buildings often expose internal materials, manifesting spectral characteristics that markedly distinguish them from structures that remain intact [31]. In this study, we extract the red, green, and blue spectral components from post-disaster optical images to serve as spectral features, as depicted in Figure 9b–d.

4.2. Texture Characteristics

Texture features are adept at capturing the grayscale variation patterns within remote sensing images, effectively extracting spatial distribution and structural information of objects. These features play a pivotal role in various applications, including remote sensing image classification, target detection, and change detection. Earthquake-induced damage disrupts the regular texture patterns of buildings, resulting in significant alterations to their texture features. In this study, the Gray-Level Co-occurrence Matrix (GLCM) and Local Binary Pattern (LBP) are utilized to extract texture features from post-disaster images, thereby providing crucial support for the development of damage detection tasks.

4.2.1. Gray Level Co-Occurrence Matrix

Gray-level co-occurrence matrix (GLCM) is a grayscale statistical method used to describe texture features. Each element in the matrix represents the frequency with which a specific pair of grayscale values occurs at a given direction and distance. The specific calculation method is as follows: (1) select a direction θ and a distance d; (2) traverse all pixels, recording how often grayscale levels i and j occur as a pair at direction θ and adjacent distance d; (3) accumulate these frequencies to form the gray-level co-occurrence matrix. We slide a 5 × 5 window over the post-disaster images to generate the gray-level co-occurrence matrix. The texture feature values are then calculated from the matrix as follows (in Formulas (5)–(9), i and j represent different gray levels, and P(i, j) represents the frequency of occurrence of a gray-level combination):
Angular Second Moment (ASM), also referred to as energy, is employed to describe the uniformity of grayscale distribution and the fineness of texture within images. A higher ASM value indicates a more uniform and distinct image texture, whereas a lower ASM value suggests less uniformity. The ASM feature map is depicted in Figure 10b, and the specific calculation formula is as follows:
ASM = \sum_{i}\sum_{j} P(i,j)^2
Entropy (ENT) is a metric used to quantify the randomness of the information content within an image. It correlates positively with the randomness of grayscale information and the complexity of the image’s detailed texture. The higher the entropy, the more complex and less predictable the image texture. The entropy feature map is illustrated in Figure 10c, and the specific methodology for its calculation is as follows:
ENT = -\sum_{i}\sum_{j} P(i,j) \log P(i,j)
Contrast (Con) serves as an indicator of local variations within images, encapsulating the depth of grooves and wrinkles in the texture. This metric directly correlates with the clarity of the image texture, providing insights into the texture’s sharpness and variation. The contrast feature map is displayed in Figure 10d, and the specific formula for its calculation is as follows:
Con = \sum_{i}\sum_{j} (i-j)^2 \, P(i,j)
Inverse Differential Moment (IDM), also known as inverse variance, quantifies the rate of change in differences between distinct grayscale levels within an image. This metric inversely correlates with the magnitude of local grayscale variations—the smaller the variation, the higher the IDM value. This relationship underscores the homogeneity of the image texture. The IDM feature map is illustrated in Figure 10e, and the specific formula for its calculation is as follows:
IDM = \sum_{i}\sum_{j} \frac{P(i,j)}{1+(i-j)^2}
Correlation (Cor) is employed to assess the similarity among grayscale levels along the row and column axes of an image. This feature directly captures the local grayscale correlation, wherein a stronger grayscale correlation corresponds to higher values of the correlation feature. The correlation feature map is depicted in Figure 10f. The specific calculation formula—where μ i and μ j represent the mean values in the row and column directions, and σ i and σ j represent the standard deviations in the row and column directions—is as follows:
Cor = \frac{\sum_{i}\sum_{j}(i-\mu_i)(j-\mu_j)\,P(i,j)}{\sigma_i \sigma_j}
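The GLCM construction and the five feature values above can be sketched as follows. The quantization to a small number of gray levels and the single fixed offset are simplifications relative to the 5 × 5 sliding-window pipeline described in the text.

```python
import numpy as np

def glcm(patch, dx=1, dy=0, levels=8):
    """Normalized gray-level co-occurrence matrix P(i, j) for offset (dx, dy).
    `patch` must be quantized to integer gray levels < `levels`."""
    g = np.zeros((levels, levels))
    h, w = patch.shape
    for y in range(h - dy):
        for x in range(w - dx):
            g[patch[y, x], patch[y + dy, x + dx]] += 1
    return g / g.sum()

def glcm_features(P):
    """ASM, entropy, contrast, IDM, and correlation of a normalized GLCM."""
    i, j = np.indices(P.shape)
    asm = float((P ** 2).sum())
    ent = float(-(P[P > 0] * np.log(P[P > 0])).sum())
    con = float(((i - j) ** 2 * P).sum())
    idm = float((P / (1 + (i - j) ** 2)).sum())
    mu_i, mu_j = (i * P).sum(), (j * P).sum()
    s_i = np.sqrt((((i - mu_i) ** 2) * P).sum())
    s_j = np.sqrt((((j - mu_j) ** 2) * P).sum())
    cor = float(((i - mu_i) * (j - mu_j) * P).sum() / (s_i * s_j)) if s_i and s_j else 0.0
    return asm, ent, con, idm, cor
```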

4.2.2. Local Binary Pattern

The Local Binary Pattern (LBP) is a texture descriptor extensively utilized in the field of computer vision. It fundamentally employs the pixel intensity values surrounding a central pixel to sample and generate a binary pattern, which effectively encapsulates the texture characteristics of the local area. For this study, LBP was extracted from post-disaster images using the following specific computational steps:
  • Select the 8-neighborhood region around the central pixel.
  • Compare each neighboring pixel to the central pixel: assign a value of 1 if the neighboring pixel’s intensity is greater than that of the central pixel, and 0 otherwise.
  • Formulate a binary number by arranging the assigned values clockwise based on the comparison results.
  • Convert the binary number to a decimal number as the texture feature value of the current pixel region.
An example of the Local Binary Pattern feature value is shown in Figure 10g.
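The four steps above can be sketched for a single 3 × 3 neighborhood as follows; the clockwise starting position (top-left) is an assumption, as implementations differ on where the pattern begins.

```python
import numpy as np

def lbp_pixel(patch3):
    """LBP code for the centre of a 3x3 patch: neighbour > centre -> 1,
    bits read clockwise starting from the top-left corner, then the
    binary pattern is converted to a decimal feature value."""
    c = patch3[1, 1]
    clockwise = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    bits = "".join("1" if patch3[r, col] > c else "0" for r, col in clockwise)
    return int(bits, 2)
```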

4.3. Edge Characteristics

Edge features denote the linear characteristics of contour lines, which are discerned through a comprehensive series of image processing methods designed to detect transitions in grayscale values. In intact buildings, edge features typically manifest a regular pattern within the image. However, in the aftermath of an earthquake, the structural form of buildings experiencing severe damage or collapse is significantly altered, manifesting distinct changes in edge features when compared to pre-earthquake images. Consequently, the extraction of edge features is essential for identifying and analyzing the extent of damage in post-earthquake buildings. In this study, we employ the Prewitt operator and the Laplacian operator to extract edge features from images of earthquake-damaged buildings [33,34]. Illustrations of these edge features are presented in Figure 11b,c.
The Prewitt operator, recognized as a straightforward yet effective edge detection tool, computes local grayscale variations using first-order differences. A threshold applied to the response identifies grayscale transition points, thereby pinpointing edge locations. The operator’s templates are given below. We convolve the post-disaster image with d_x and d_y using a 3 × 3 sliding window and average the two responses to obtain the final Prewitt edge strength:
d_x = \begin{bmatrix} -1 & 0 & +1 \\ -1 & 0 & +1 \\ -1 & 0 & +1 \end{bmatrix}, \qquad d_y = \begin{bmatrix} -1 & -1 & -1 \\ 0 & 0 & 0 \\ +1 & +1 & +1 \end{bmatrix}
The Laplacian operator, a second-order differential operator, computes the rate of change across image pixels in various directions using the image’s second-order partial derivatives, facilitating the extraction of edge features. In this study, we employ the eight-neighborhood Laplacian operator to derive edge features from remote sensing images; its template is given below. We obtain the Laplacian edge intensity by convolving the post-disaster image with the operator H:
H = \begin{bmatrix} 1 & 1 & 1 \\ 1 & -8 & 1 \\ 1 & 1 & 1 \end{bmatrix}
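Both operators can be applied with a simple “valid” 3 × 3 convolution, as in this sketch; averaging the absolute d_x and d_y responses follows the Prewitt description above, and the kernels are written out explicitly.

```python
import numpy as np

PREWITT_DX = np.array([[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]])
PREWITT_DY = np.array([[-1, -1, -1], [0, 0, 0], [1, 1, 1]])
LAPLACE_8  = np.array([[1, 1, 1], [1, -8, 1], [1, 1, 1]])

def convolve_valid(img, kernel):
    """'Valid' 3x3 sliding multiply-and-sum (no padding)."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for y in range(h - 2):
        for x in range(w - 2):
            out[y, x] = (img[y:y + 3, x:x + 3] * kernel).sum()
    return out

def prewitt_strength(img):
    """Average of the absolute d_x and d_y responses."""
    return (np.abs(convolve_valid(img, PREWITT_DX)) +
            np.abs(convolve_valid(img, PREWITT_DY))) / 2.0
```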

4.4. Building Index Characteristics

The Morphological Building Index (MBI) serves as a quantitative descriptor for urban architectural morphological features [32]. This index leverages specific spectral properties combined with a sequence of morphological operations to extract pertinent building information. Earthquakes impart substantial damage to the morphological characteristics of structures, which is reflected in alterations to the Morphological Building Index in images of destroyed buildings. In this study, the MBI is utilized to extract characteristics of buildings damaged by earthquakes. The methodological approach involves the following steps: 1. Take the maximum pixel value across the visible bands as the brightness image; 2. Perform morphological white top-hat reconstruction on the brightness image; 3. Calculate the Differential Morphological Profiles (DMP); 4. Compute the Morphological Building Index (MBI). The outcomes of this process are illustrated in Figure 12b.
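A heavily simplified sketch of steps 2–4 is given below. The full MBI uses directional linear structuring elements at several scales in four directions with morphological reconstruction, so this two-direction, plain-opening version is only illustrative.

```python
import numpy as np

def white_tophat_line(img, length, horizontal=True):
    """White top-hat with a 1-D line structuring element:
    top-hat = img - opening, where opening = dilation(erosion)."""
    ax = 1 if horizontal else 0
    pad = length // 2
    def slide(a, fn):
        widths = [(0, 0), (pad, pad)] if ax == 1 else [(pad, pad), (0, 0)]
        ap = np.pad(a, widths, mode="edge")
        out = np.empty_like(a)
        for i in range(a.shape[ax]):
            window = ap.take(range(i, i + length), axis=ax)
            if ax == 1:
                out[:, i] = fn(window, axis=1)
            else:
                out[i, :] = fn(window, axis=0)
        return out
    a = img.astype(float)
    opening = slide(slide(a, np.min), np.max)
    return a - opening

def mbi(brightness, scales=(3, 5, 7)):
    """Mean of differential morphological profiles (DMP) over scales and
    two directions; the full MBI uses four directions."""
    dmps = []
    for horiz in (True, False):
        profile = [white_tophat_line(brightness, s, horiz) for s in scales]
        dmps += [np.abs(profile[k + 1] - profile[k])
                 for k in range(len(profile) - 1)]
    return np.mean(dmps, axis=0)
```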

4.5. Time Characteristics

In this study, the 3D Gray-Level Co-occurrence Matrix (3DGLCM) was employed to capture the temporal features for image computation and extraction. Earthquakes induce noticeable spectral and textural variations in damaged buildings, discernible in dual-temporal images. The application of 3D gray-level co-occurrence matrices facilitates the effective extraction of these variations and their integration into machine learning tasks [19]. Conventionally, 2D gray-level co-occurrence matrices statistically assess the probability of adjacent gray levels along a specific direction, as illustrated in Figure 13a. These matrices are capable of being computed in multiple directions on a two-dimensional plane, as shown in Figure 13b. The 3D gray-level co-occurrence matrix extends this concept by aggregating pixel values between dual-temporal images for comprehensive statistical analysis, depicted in Figure 13c. In the proposed feature set, texture feature values are derived using 3D gray-level co-occurrence matrices, with the features subsequently formatted into raster images to capture 3D temporal texture characteristics. The results of these generated feature values are displayed in Figure 14c–f.
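A minimal sketch of the temporal co-occurrence idea: counting pairs of pre- and post-disaster gray levels at the same location yields a matrix whose off-diagonal mass reflects change. This is a simplification; the 3D GLCM described above also covers in-plane offsets.

```python
import numpy as np

def temporal_glcm(pre, post, levels=8):
    """Co-occurrence between the two dates: count (pre value, post value)
    pairs at the same pixel and normalize to P(i, j).
    Inputs must be quantized to integer gray levels < `levels`."""
    g = np.zeros((levels, levels))
    for i, j in zip(pre.ravel(), post.ravel()):
        g[i, j] += 1
    return g / g.sum()
```

GLCM statistics such as contrast can then be computed on this matrix exactly as in Section 4.2.1: an unchanged scene puts all mass on the diagonal (zero contrast), while damage-induced spectral change moves mass off it.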

5. Sources of Data Noise

In the proposed feature set, a small subset of the post-disaster data is affected by cloud and fog, which obscures the imagery of both intact and damaged buildings, causing the loss of architectural features and introducing noise into the corresponding feature maps. Furthermore, when acquiring pre-disaster data for the Moroccan earthquake, we encountered variability in the timing and quality of the image sources, leaving some scenes completely obscured by cloud and fog. This inconsistency substantially affects computations involving the 3D Gray-Level Co-occurrence Matrix. If a machine learning model converges poorly, we suggest excluding the cloud-affected data. To facilitate their exclusion in research applications, we have cataloged the cloud-affected pre-disaster scenes in a TXT document. Experiments that use only post-disaster data may disregard this issue.
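As a concrete illustration, samples could be filtered against that TXT list before training. The file name `cloud_affected.txt` and the `pre/*.png` directory layout below are assumptions for the sketch, not the dataset's actual naming:

```python
from pathlib import Path


def load_clean_images(data_dir, cloud_list="cloud_affected.txt"):
    """Drop pre-disaster images listed in the cloud-affected TXT file.
    Assumes one image file name per line; file name and folder layout
    are hypothetical."""
    listed = set()
    cloud_file = Path(data_dir) / cloud_list
    if cloud_file.exists():
        listed = {line.strip() for line in cloud_file.read_text().splitlines()
                  if line.strip()}
    all_images = sorted(Path(data_dir).glob("pre/*.png"))
    return [p for p in all_images if p.name not in listed]
```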

6. Conclusions

The interpretation of building damage from remote sensing is critically important for post-earthquake emergency response and related applications. In this study, optical images from multiple remote sensing satellites were selected, amassing a dataset of 7062 pre- and post-disaster images, which was used to construct a large-scale feature set for analyzing buildings damaged by five major earthquakes. For data annotation, this study classifies post-disaster buildings into three categories and provides annotations for images showing severe damage and complete collapse. For feature generation, the study developed a feature set with five branches encompassing a total of 16 feature values. To aid researchers, each feature image was resized to 512 × 512 pixels, facilitating straightforward integration into subsequent deep learning algorithms. This paper announces the public release of the Multi-Scale Earthquake Damaged Building Feature Set (MEDBFS) and actively encourages contributions from the research community to enhance the dataset. Our team is committed to continuously improving the feature set in forthcoming studies. Ultimately, we anticipate that this openly accessible feature set will propel advances in remote sensing-based post-disaster building detection.
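Because every feature image shares the 512 × 512 size, the 16 feature classes can be stacked directly into a multi-channel tensor for a deep learning pipeline. The sketch below assumes the images are already loaded as arrays; the normalization choice is illustrative:

```python
import numpy as np


def stack_features(feature_images):
    """Stack 16 per-class feature images (each 512x512) into a
    (16, 512, 512) tensor with per-channel min-max normalization."""
    stack = np.stack([np.asarray(img, dtype=np.float32)
                      for img in feature_images])
    assert stack.shape[1:] == (512, 512), "feature images must be 512x512"
    flat = stack.reshape(len(feature_images), -1)
    mins = flat.min(axis=1)[:, None, None]
    maxs = flat.max(axis=1)[:, None, None]
    # Guard against constant channels (e.g. cloud-obscured scenes).
    return (stack - mins) / np.maximum(maxs - mins, 1e-8)
```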

Author Contributions

Conceptualization, F.W. and L.W.; methodology, G.G. and Z.W.; software, G.Q.; validation, Q.Z. and G.G.; resources, J.Z. and W.L.; writing-original draft preparation, G.G.; writing-review and editing, Y.H.; supervision, F.W.; funding acquisition, F.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the National Key R&D Program of China (2021YFC1523501, 2021YFB3901201, 2022YFC3006400).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All the datasets and feature sets in this article can be obtained from the following GitHub link: https://github.com/ziwen-hash/MEDBFS (accessed on 22 June 2024). Some data are temporarily withheld due to the progress of the project; all data will be disclosed after the project is completed.

Acknowledgments

We express our gratitude to the creators of the xBD dataset for providing a portion of the seismic image dataset. We also acknowledge the Gaofen-2, Jilin-1, and QuickBird satellites for supplying pre- and post-disaster imagery. Additionally, we are thankful for MAXAR's publicly available high-resolution seismic imagery, which served as foundational data for the construction of this feature set and has made significant contributions to seismic disaster research.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. United Nations Office for Disaster Risk Reduction. Terminology on Disaster Risk Reduction. UNDRR.org. United Nations, n.d. Web. Available online: https://www.undrr.org/terminology/disaster (accessed on 23 May 2024).
  2. Hou, H. Earthquake disaster prevention system of earthquake disaster prevention and reduction work system. Eng. Technol. Res. 2019, 23, 234–235. [Google Scholar] [CrossRef]
  3. Shohet, I.; Wei, H.; Shapira, S.; Robert, L.; Aharonson-Daniel, L.; Levi, T.; Salamon, A.; Levy, R.; Levi, O. Analytical-Empirical Model for the Assessment of Earthquake Casualties and Injuries in a Typical Israeli City; Final Report; Department of Structural Engineering, Ben-Gurion University: Beer Sheva, Israel, 2015. [Google Scholar]
  4. Luino, F.; Barriendos, M.; Gizzi, F.T.; Glaser, R.; Gruetzner, C.; Palmieri, W.; Porfido, S.; Sangster, H.; Turconi, L. Historical Data for Natural Hazard Risk Mitigation and Land Use Planning. Land 2023, 12, 1777. [Google Scholar] [CrossRef]
  5. Ni, S.D.; Wang, W.T.; Li, L. The April 14th, 2010 Yushu earthquake, a devastating earthquake with foreshocks. Sci. China Earth Sci. 2010, 53, 791. [Google Scholar] [CrossRef]
  6. Wang, T.; Chen, J.; Zhou, Y.; Wang, X.; Lin, X.; Wang, X. Preliminary investigation of building damage in Hatay under February 6, 2023 Turkey earthquakes. Earthq. Eng. Eng. Vib. 2023, 22, 853–866. [Google Scholar] [CrossRef]
  7. Booth, E.; Saito, K.; Spence, R.; Madabhushi, G.; Eguchi, R. Validating assessments of seismic damage made from remote sensing. Earthq. Spectra 2011, 27, 157–177. [Google Scholar] [CrossRef]
  8. Brown, D.; Saito, K.; Liu, M.; Spence, R.; So, E.; Ramage, M. The use of remotely sensed data and ground survey tools to assess damage and monitor early recovery following the 12.5.2008 Wenchuan earthquake in China. Bull. Earthq. Eng. 2011, 10, 741–764. [Google Scholar] [CrossRef]
  9. Yan, C.; Bo, L.; Wei, Z.; Feng, X.; Juan, N. Interpretation of remote sensing images of house damage caused by the Ya’an earthquake. Spacecr. Eng. 2014, 5, 129–134. [Google Scholar]
  10. Gu, J.; Xie, Z.; Zhang, J.; He, X. Advances in Rapid Damage Identification Methods for Post-Disaster Regional Buildings Based on Remote Sensing Images: A Survey. Buildings 2024, 14, 898. [Google Scholar] [CrossRef]
  11. Deng, L.; Wang, Y. Post-disaster building damage assessment based on improved U-Net. Sci. Rep. 2022, 12, 15862. [Google Scholar] [CrossRef]
  12. Weber, E.; Kané, H. Building disaster damage assessment in satellite imagery with multi-temporal fusion. arXiv 2020, arXiv:2004.05525. [Google Scholar] [CrossRef]
  13. Corbane, C.; Saito, K.; Dell’Oro, L.; Bjorgo, E.; Gill, S.P.; Emmanuel Piard, B.; Huyck, C.K.; Kemper, T.; Lemoine, G.; Spence, R.J.; et al. A Comprehensive Analysis of Building Damage in the 12 January 2010 MW 7 Haiti Earthquake using High-Resolution Satellite and Aerial Imagery. Photogramm. Eng. Remote Sens. 2011, 77, 997–1009. [Google Scholar] [CrossRef]
  14. Aali, H.; Sharifi, A.; Malian, A. Earthquake damage detection using satellite images (case study: Sarpol-zahab earthquake). ISPRS—Int. Arch. Photogramm. Remote. Sens. Spat. Inf. Sci. 2019, 42, 1–5. [Google Scholar] [CrossRef]
  15. Huang, H.; Sun, G.; Zhang, X.; Hao, Y.; Zhang, A.; Ren, J.; Ma, H. Combined multiscale segmentation convolutional neural network for rapid damage mapping from postearthquake very high-resolution images. J. Appl. Remote. Sens. 2019, 13, 022007. [Google Scholar] [CrossRef]
  16. Song, D.; Tan, X.; Wang, B.; Zhang, L.; Shan, X.; Cui, J. Integration of super-pixel segmentation and deep-learning methods for evaluating earthquake-damaged buildings using single-phase remote sensing imagery. Int. J. Remote. Sens. 2019, 41, 1040–1066. [Google Scholar] [CrossRef]
  17. Gupta, R.; Goodman, B.; Patel, N.; Hosfelt, R.; Sajeev, S.; Heim, E.; Doshi, J.; Lucas, K.; Choset, H.; Gaston, M. Creating xBD: A dataset for assessing building damage from satellite imagery. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Long Beach, CA, USA, 16–17 June 2019; pp. 10–17. [Google Scholar]
  18. Dong, Z.; Wang, W.; Li, L.; Luo, W.; Wu, Z. Construction of a sample dataset for earthquake-damaged buildings based on post-disaster unmanned aerial vehicle remote sensing images. Artif. Intell. Robot. Res. 2022, 11, 227–235. [Google Scholar] [CrossRef]
  19. Sun, Y.; Wang, Y.; Eineder, M. QuickQuakeBuildings: Post-earthquake SAR-Optical Dataset for Quick Damaged-building Detection. arXiv 2023, arXiv:2312.06587. [Google Scholar] [CrossRef]
  20. Foulser-Piggott, R.; Spence, R.; Saito, K.; Brown, D.M.; Eguchi, R. The Use of Remote Sensing for Post-Earthquake Damage Assessment: Lessons from Recent Events, and Future Prospects. In Proceedings of the Fifteenth World Conference on Earthquake Engineering, Lisbon, Portugal, 24–28 September 2012; p. 10. [Google Scholar]
  21. Guo, H.; Lu, L.; Ma, J.; Martino, P.; Yuan, F. An improved automatic identification method of remote sensing information of collapsed houses in earthquake disasters. Bull. Sci. 2009, 17, 2581–2585. [Google Scholar]
  22. Miura, H.; Modorikawa, S.; Chen, S.H. Texture characteristics of high-resolution satellite images in damaged areas of the 2010 Haiti earthquake. In Proceedings of the 9th International Workshop on Remote Sensing for Disaster Response, Stanford, CA, USA, 15–16 September 2011; pp. 15–16. [Google Scholar]
  23. Fu, B. Research on Seismic Damage Identification of Buildings Based on UAV Orthophotos. Master’s Thesis, Institute of Geology, China Earthquake Administration, Beijing, China, 2018. [Google Scholar]
  24. Deng, Y.; Lu, W. Extraction of remote sensing information of post-earthquake collapsed houses based on gray-scale co-occurrence matrix—Taking the 2014 6.5-magnitude earthquake in Ludian, Yunnan as an example. South China Earthq. 2019, 2, 100–111. [Google Scholar] [CrossRef]
  25. Ji, M.; Liu, L.; Du, R.; Buchroithner, M.F. A comparative study of texture and convolutional neural network features for detecting collapsed buildings after earthquakes using pre- and post-event satellite imagery. Remote Sens. 2019, 11, 1202. [Google Scholar] [CrossRef]
  26. Ramírez-Herrera, M.T.; Corona, N.; Ruiz-Angulo, A.; Melgar, D.; Zavala-Hidalgo, J. The 8 September 2017 tsunami triggered by the M w 8.2 intraplate earthquake, Chiapas, Mexico. Pure Appl. Geophys. 2018, 175, 25–34. [Google Scholar] [CrossRef]
  27. Errazzouki, S. Between rubble and rage: Reflections on the earthquake in Morocco and its aftermath. J. N. Afr. Stud. 2024, 29, 197–205. [Google Scholar] [CrossRef]
  28. Rasheed, R.; Chen, B.; Wu, D.; Wu, L. A Comparative Study on Multi-Parameter Ionospheric Disturbances Associated with the 2015 Mw 7.5 and 2023 Mw 6.3 Earthquakes in Afghanistan. Remote. Sens. 2024, 16, 1839. [Google Scholar] [CrossRef]
  29. Grünthal, G. European Macroseismic Scale 1998; European Seismological Commission (ESC): Potsdam, Germany, 1998. [Google Scholar]
  30. Yan, T.; Wan, Z.; Zhang, P. Fully transformer network for change detection of remote sensing images. In Proceedings of the Asian Conference on Computer Vision, Macao, China, 4–8 December 2022; pp. 1691–1708. [Google Scholar]
  31. Woo, S.; Debnath, S.; Hu, R.; Chen, X.; Liu, Z.; Kweon, I.S.; Xie, S. Convnext v2: Co-designing and scaling convnets with masked autoencoders. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 16133–16142. [Google Scholar]
  32. Haralick, R.; Shanmugam, K.; Dinstein, I. Textural Features for Image Classification. IEEE Trans. Syst. Man Cybern. 1973, SMC-3, 610–621. [Google Scholar] [CrossRef]
  33. Ojala, T.; Pietikäinen, M.; Harwood, D. A comparative study of texture measures with classification based on featured distributions. Pattern Recognit. 1996, 29, 51–59. [Google Scholar] [CrossRef]
  34. Prewitt, J.M. Object enhancement and extraction. Pict. Process. Psychopictorics 1970, 10, 15–19. [Google Scholar]
  35. Jähne, B. Digital Image Processing; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2005. [Google Scholar]
  36. Moya, L.; Zakeri, H.; Yamazaki, F.; Liu, W.; Mas, E.; Koshimura, S. 3D gray level co-occurrence matrix and its application to identifying collapsed buildings. ISPRS J. Photogramm. Remote. Sens. 2019, 149, 14–28. [Google Scholar] [CrossRef]
Figure 1. Proportion of pre- and post-disaster image samples for each earthquake event.
Figure 2. Folders’ structure, taking the Chinese earthquake as an example.
Figure 3. Image preprocessing process.
Figure 4. Example of ground truth value. (a) China earthquake, (b) Turkey earthquake, (c) Afghanistan earthquake.
Figure 5. Training loss curve of ConvNeXt-FTN. (a) ConvNeXt-small-FTN; (b) ConvNeXt-base-FTN.
Figure 6. Training and validation F1-score curve of ConvNeXt-FTN. (a) ConvNeXt-small-FTN; (b) ConvNeXt-base-FTN.
Figure 7. Example of ground truth value. (a) Afghanistan earthquake, (b) Morocco earthquake, (c) China earthquake.
Figure 8. Construction of damaged building features.
Figure 9. Spectral feature set example of China earthquake. (a) Post-disaster optical image; (b) Red feature; (c) Green feature; (d) Blue feature.
Figure 10. Texture feature set example of China earthquake. (a) Post-disaster optical image; (b) ASM feature; (c) Entropy feature; (d) Contrast features; (e) IDM feature; (f) Correlation feature; (g) LBP features.
Figure 11. Edge feature set example of China earthquake. (a) Post-disaster optical image; (b) Prewitt edge feature; (c) Laplacian edge feature.
Figure 12. MBI feature set example of China earthquake. (a) Post-disaster optical image; (b) MBI feature.
Figure 13. Schematic diagram of pixel relationship of gray level co-occurrence matrix. (a) Single direction GLCM; (b) Multiple directions GLCM; (c) 3D GLCM.
Figure 14. 3D texture feature set example of China earthquake. (a) Pre-disaster optical image; (b) Post-disaster optical image; (c) 3D ASM feature; (d) 3D Entropy feature; (e) 3D Contrast feature; (f) 3D IDM feature; (g) 3D Correlation feature.
Table 1. Earthquake details.
| Earthquake | Date | Epicenter Location | Magnitude | Image Coverage Area |
|---|---|---|---|---|
| China Earthquake | 14 April 2010 | 33°03′11″ N, 96°51′26″ E | 7.1 | Yushu City, Qinghai Province, China |
| Mexico Earthquake | 8 September 2017 | 34°42′00″ N, 61°54′00″ E | 8.2 | Puebla and surrounding areas, Mexico |
| Turkey Earthquake | 6 February 2023 | 37°09′00″ N, 36°57′00″ E | 7.8 | Kahramanmaraş and Gaziantep, Türkiye |
| Morocco Earthquake | 8 September 2023 | 31°00′00″ N, 8°33′00″ W | 6.9 | Marrakech and surrounding areas, Morocco |
| Afghanistan Earthquake | 8 October 2023 | 34°42′00″ N, 61°54′00″ E | 6.2 | Herat and surrounding areas, Afghanistan |
Table 2. Data source details.
| Earthquake | Satellite/Dataset | Resolution | Number of Images |
|---|---|---|---|
| China Earthquake | QuickBird | 0.61 m | 1132 |
| Mexico Earthquake | xBD | 0.5 m | 1448 |
| Turkey Earthquake | Jilin-1 | 0.75 m | 634 |
| Morocco Earthquake | WorldView-3 | 0.3 m | 2972 |
| Afghanistan Earthquake | Jilin-1/GF-2 | 0.75 m | 876 |
Table 3. Damaged building marking standards of xBD.
| Disaster Level | Structure Description |
|---|---|
| 0 (No Damage) | Undisturbed. No sign of water, structural or shingle damage, or burn marks. |
| 1 (Minor Damage) | Building partially burnt, water surrounding structure, volcanic flow nearby, roof elements missing, or visible cracks. |
| 2 (Major Damage) | Partial wall or roof collapse, encroaching volcanic flow, or surrounded by water/mud. |
| 3 (Destroyed) | Scorched, completely collapsed, partially/completely covered with water/mud, or otherwise no longer present. |
Table 4. Damaged building marking standards.
| Damage Level | Spectral Features | Texture Features | Geometric Features | Shadow Features |
|---|---|---|---|---|
| Intact/Slight Damage | Uniform grayscale, no significant spot changes, no bright debris around the building | Uniform image structure, uniform tone, natural transition | Regular geometric shape, clear building boundaries | Regular shadows, no obvious interruptions, clear and distinguishable |
| Severe Damage | Lighter tone, forming light spot-like debris between regularly arranged buildings | Relatively uniform image structure, chaotic texture on roof parts | Relatively clear building boundaries, partial defects in boundary lines, slight overall displacement | Shadow contours visible, some irregularities |
| Complete Collapse | Covered by rubble, rubble forms light-toned areas in the image, overall reflectivity increases after complete collapse | Chaotic and irregular image structure, blurred and rough, collapsed buildings blend with bare soil in the background, uneven grayscale | Obvious destruction or disappearance of regular building shapes; all building boundary lines are blurred or building tones blend with surrounding ground, collapsed roofs, damaged into fragments | Shadows are barely distinguishable |
Table 5. Confusion matrix.
| | Predicted Positive | Predicted Negative |
|---|---|---|
| Actual Positive | True Positive (TP) | False Negative (FN) |
| Actual Negative | False Positive (FP) | True Negative (TN) |
Table 6. Validate experimental results.
| Network | F1-Score | MIoU | FWIoU |
|---|---|---|---|
| ConvNeXt-small-FTN | 74.73 | 70.65 | 97.39 |
| ConvNeXt-base-FTN | 74.43 | 74.16 | 97.56 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Gao, G.; Wang, F.; Wang, Z.; Zhao, Q.; Wang, L.; Zhu, J.; Liu, W.; Qin, G.; Hou, Y. Multi-Scale Earthquake Damaged Building Feature Set. Data 2024, 9, 88. https://doi.org/10.3390/data9070088

