Article

The Motion Estimation of Unmanned Aerial Vehicle Axial Velocity Using Blurred Images

1 Technology & Research Center, China Yangtze Power Co., Ltd., Yichang 443002, China
2 College of Electronic and Information Engineering, Tongji University, Shanghai 201804, China
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Drones 2024, 8(7), 306; https://doi.org/10.3390/drones8070306
Submission received: 11 May 2024 / Revised: 3 July 2024 / Accepted: 4 July 2024 / Published: 8 July 2024
(This article belongs to the Special Issue Advanced Unmanned System Control and Data Processing)

Abstract

This study proposes a novel method for estimating the axial velocity of unmanned aerial vehicles (UAVs) using motion blur images captured in environments where GPS signals are unavailable and lighting conditions are poor, such as underground tunnels and corridors. By correlating the length of motion blur observed in images with the UAV’s axial speed, the method addresses the limitations of traditional techniques in these challenging scenarios. We enhanced the accuracy by synthesizing motion blur images from neighboring frames, which is particularly effective at low speeds where single-frame blur is minimal. Six flight experiments conducted in the corridor of a hydropower station demonstrated the effectiveness of our approach, achieving a mean velocity error of 0.065 m/s compared to ultra-wideband (UWB) measurements and a root-mean-squared error within 0.3 m/s. The results highlight the stability and precision of the proposed velocity estimation algorithm in confined and low-light environments.

1. Introduction

A special tunnel environment can be regarded as a simple extruded structure: a planar cross-section stretched along the axis. When the internal scene is relatively simple, the axis itself provides no geometric information such as tangent or curvature, and existing methods, such as the commonly used LiDAR point clouds [1] or binocular vision [2], cannot effectively estimate the speed of the UAV along the axis.
The key issue is that, when the axis itself is a straight line, the point cloud or image information provided by adjacent frames cannot yield a good position estimate. For point clouds, the pipe-wall points in adjacent frames are nearly identical; for environments with similar internal textures, adjacent images are likewise highly similar, and sometimes features cannot be extracted at all [3]. Therefore, axial velocity estimation has become a significant challenge in pipeline flight.
Several advanced methods have been proposed in the literature for UAV localization and velocity estimation. Techniques such as deep learning-based models, large-scale 3D mapping, and advanced sensor fusion algorithms have shown promising results in various scenarios. For example, physics-informed neural networks (PINNs) integrate governing physical laws into neural network models to enhance their predictive capabilities [4,5]. Similarly, large language models (LLMs) and visual language models (VLMs) have been applied to tasks like blurred image motion estimation, demonstrating significant potential for improving UAV navigation accuracy [6,7,8]. Additionally, sensor fusion techniques that combine data from LiDAR, radar, and other sensors have been employed to improve the robustness of UAV positioning systems in complex environments [9].
However, the implementation of these advanced methods is currently constrained by the limited computing power available on UAV on-board computers. Real-time execution of large models or computationally intensive algorithms remains a challenge, so these methods cannot yet be deployed on UAVs for real-time applications. We anticipate that, with advancements in large-model technology and on-board computing capabilities, it will become feasible to deploy such methods on UAVs. Future developments in computational hardware and algorithm optimization will play a crucial role in overcoming these limitations and enhancing the performance of UAV navigation systems.
In the existing research, there are mainly two ways to estimate the axial speed. The first relies on recognizable internal texture: the axial speed of the UAV is calculated from high-precision processing of images captured by a camera, an optical flow sensor, or similar detection equipment. The second uses additional equipment such as ultra-wideband (UWB): a base station is established near the take-off point of the UAV, the UAV carries a signal receiver, and the real-time UWB signal is used to estimate the axial position.
Both methods have inherent disadvantages. The first depends on the texture pattern of the tunnel's inner wall: in an environment with rich pattern features it works excellently and stably, but it fails in scenes without texture features, such as tunnels subject to perennial flushing [10]. The second obtains additional information through extra physical equipment, but UWB and similar signals are prone to wall-reflection interference inside the tunnel. Actual experiments show that UWB signals work normally over part of the range, but when the UAV is too far from or too close to the wall, they degrade and may even exhibit large jumps or signal loss [11].
In existing pipeline UAV flight, for different problems such as obstacle-avoidance safety [12], positioning [13], and navigation [14], researchers use multi-sensor fusion [15,16], optimized pose-estimation algorithms [17], optimized system structures [18], and other means to address specific scenarios. In practical applications, it is common to estimate the speed of moving objects with fixed cameras, although the estimation methods differ. J. A. Dwicahya et al. [19] proposed a method for estimating the speed of a moving object from the grayscale of its blurred image, achieved by analyzing the degree of blur of the moving object. The case of a moving camera and stationary objects is analogous: the imaging of the relative motion between the two is used to estimate the relative velocity.
In 2018, H. Li et al. [14] calculated the speed and pose of a UAV by geometric methods, using the rate of change of the optical flow field contained in each pixel of each image frame. Their method of computing position and velocity without relying on external positioning solves the problems of cumulative error and matching failure. However, it involves many assumptions and definitions, and the three-dimensional structure of the tunnel is modeled as a planar cross-section extruded vertically, which differs from most actual environments.
For a picture formed during motion, in addition to the pixel information contained in the picture itself, there is another key piece of information: blur. For visual SLAM, blur is an interference factor that degrades feature matching. Therefore, many researchers improve the accuracy of feature matching and reduce the probability of mismatches by studying the blur kernel or the degree of blur of a picture, or by screening out blurred pictures.
However, for a single picture, the degree of blur also reflects the instantaneous speed in the corresponding direction. Since this study only discusses speed estimation along the axis direction, blur caused by the UAV's motion in other directions during flight is temporarily ignored.
Given these circumstances, and considering the payload constraints of a small-sized UAV, the weight and dimensions of the sensors are limited. Consequently, this study proposes a methodology for estimating the axial velocity of the UAV from the analysis of a motion blur image. A monocular camera is fixed under the UAV, with its optical axis parallel to the ground when the UAV is in a horizontal orientation; this configuration facilitates estimation of the axial speed. The present study advocates using motion blur images captured by the monocular camera during UAV motion, establishing the relationship between the motion blur length and the actual movement distance of the UAV, and finally estimating the axial velocity of the UAV. Recognizing that single-frame blur is not obvious at lower speeds, this study synthesizes motion blur images from adjacent frames to enhance the accuracy of axial velocity estimation.
The primary contributions of this study can be delineated as follows:
  • A solution to the problem of axial motion speed estimation in confined environments. Owing to dark light, weak texture, and the axial consistency of LiDAR point clouds within confined environments such as corridors, existing visual SLAM and laser SLAM methods struggle to estimate the axial speed accurately. This study proposes an approach that uses motion blur images captured by a monocular camera during UAV movement to establish a relationship between the motion blur length and the axial speed of the UAV, thereby solving the axial speed estimation problem.
  • Field experiments confirming the efficacy of the synthetic neighboring-frame motion blur algorithm. Recognizing that single-frame blur is not effective when the degree of blur is small at lower speeds, this research introduces a synthetic neighboring-frame motion blur algorithm designed for real-time estimation of axial velocity, presenting an innovative solution to existing challenges in axial velocity estimation. The outcomes of field experiments substantiate the efficacy of the proposed algorithm, demonstrating commendable stability and reliability.
To address the challenges of estimating the UAV’s axial velocity in environments with limited geometric information, we propose a novel approach based on the analysis of motion blur in images captured by a monocular camera. This methodology leverages the relationship between the blur length observed in an image and the UAV’s actual movement during the camera’s exposure time. By converting the spatial domain image to the frequency domain, we can effectively analyze the motion blur and extract meaningful velocity estimates. The following sections detail the theoretical underpinnings of our approach, the experimental setup, and the procedures employed to validate our model.

2. Theoretical Calculation Method

2.1. Camera Imaging Relationship

The first problem to be addressed is how to estimate the actual motion length based on the degree of blur observed in the images. This blur length, which we consider in the axis direction, is directly related to the UAV’s motion when it maintains flight along the central axis. The blur length in an image is a result of the UAV’s movement during the camera’s exposure time.
Assuming the blur length l in the image is known, the actual movement distance L within the exposure time can be calculated using the imaging relationship expressed in Equation (1) and illustrated in Figure 1.

$$L = \frac{H}{h}\, l$$

where L is the actual movement distance within the exposure time, H is the height from the lens to the wall, l is the blur length in the image, and h is the focal length of the camera.

After determining the movement distance L within the exposure time $t_2$, we account for the hardware performance characteristics. The exposure time $t_2$ is approximately constant; however, there is also a non-exposure time $t_1$ between frames. For the selected monocular camera, the frame rate is 30 fps, the non-exposure time $t_1$ is 0.0233 s, and the focal length h is 1.95 mm. The proportion of exposure time r can be calculated using Equation (2):

$$r = \frac{t_2}{t_2 + t_1}$$

Given this proportion, the actual displacement distance $L_r$ over one frame period can be expressed as Equation (3):

$$L_r = \frac{L}{r}$$

In the axis direction (assumed to be the x-direction), the speed estimation $v_x$ can then be expressed as in Equation (4):

$$v_x = \frac{L}{t_2} = \frac{L_r}{t_2 + t_1}$$
The process described above explains in detail the relationship between the length of the motion blur observed in the image and the actual distance traveled by the UAV, and how this relationship can be used to estimate the speed of the UAV.
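As a concrete illustration of Equations (1)–(4), the short Python sketch below converts a measured blur length into an axial speed estimate. Since the blur length is measured in pixels, an illustrative sensor pixel pitch is introduced to convert pixels to metric length on the image plane; the pixel-pitch value and the example distance are placeholders not given in the paper.

```python
# Minimal sketch of Equations (1)-(4): axial speed from motion blur length.
# The pixel pitch is an assumed placeholder; the paper specifies only the
# focal length (1.95 mm) and the exposure timings.

def axial_speed_from_blur(blur_px, H, pixel_pitch=1.4e-6,
                          h=1.95e-3, t1=0.0233, t2=0.010):
    """Estimate the axial speed v_x (m/s).

    blur_px     -- motion blur length measured in the image (pixels)
    H           -- height from the lens to the wall (m), e.g. from the
                   single-point laser ranging sensor
    pixel_pitch -- sensor pixel size (m/pixel), assumed value
    h           -- focal length (m)
    t1, t2      -- non-exposure and exposure time per frame (s)
    """
    l = blur_px * pixel_pitch      # blur length on the image plane (m)
    L = (H / h) * l                # Eq. (1): movement during the exposure
    r = t2 / (t2 + t1)             # Eq. (2): proportion of exposure time
    L_r = L / r                    # Eq. (3): displacement over one frame
    v_x = L / t2                   # Eq. (4): equivalently L_r / (t1 + t2)
    return v_x


# Example usage (the numeric output depends entirely on the placeholder
# pixel pitch and wall distance):
# v = axial_speed_from_blur(blur_px=63, H=2.0)
```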

2.2. Motion Blur Length Estimation

It can be seen from the previous section that the estimation of axial velocity can be transformed into the estimation of the blurred length on a single picture.
In the study of the blurred degree, the relationship between the blurred image and the original image is usually expressed as the following Equation (5):
$$g(x, y) = f(x, y) * h(x, y) + n(x, y)$$
where $g(x,y)$ is the intensity of the degraded blurred image, $f(x,y)$ is the intensity of the original image, $h(x,y)$ is the motion blur function, $n(x,y)$ is Gaussian noise, and $*$ denotes convolution.
Convolution is a time-consuming operation, both in the actual calculation and with respect to the pose-update frequency required for the UAV's real-time flight. Therefore, the convolution in the time domain is converted to a product in the frequency domain to reduce the computational complexity and time consumption. The process begins by applying the Fourier transform to the image, converting it from the spatial domain to the frequency domain, represented mathematically by $\mathcal{F}\{I(x,y)\}$, where $I(x,y)$ denotes the image intensity at pixel coordinates $(x,y)$. This transformation exposes periodic patterns and noise as distinct frequencies. To mitigate the impact of the bright cross-lines, which typically correspond to high-frequency noise, we create a binary mask that suppresses these frequencies in the Fourier domain. The mask is applied to the Fourier-transformed image, yielding a filtered frequency-domain representation. The inverse Fourier transform then reverts the image to the spatial domain, producing a cleaner image with reduced noise.
Motion blur is mainly related to two factors: the length of motion per unit time (i.e., velocity) and the direction of motion.
In addition, M. E. Moghaddam et al. noted in [20] that, “if the image is affected by motion blur, then in its frequency response, we can see that the main parallel lines correspond to very low values close to zero”, as shown in Figure 2.
Through the analysis and calculation of the main parallel lines, the length [21] and direction angle [22] of the instantaneous motion can be obtained. Considering that this section only studies the motion speed along the axis, the direction angle is not described.
The study provides a relationship between the width and blurred length of parallel lines as Equation (6):
$$l \propto \frac{1}{d}$$
where d is the width between the parallel lines and l is the blurred length. According to the experimental data fitting, the following empirical formula is given as Equation (7):
$$l = \frac{c_1 + c_2 d + c_3 d^2 + c_4 d^3}{1 + c_5 d + c_6 d^2 + c_7 d^3 + c_4 d^4}$$
However, the formula of this method is too complex and does not make use of the relevant information in the frequency domain of the blur function. J. A. Cortés-Osorio et al. [23] made a new proposal to estimate the amount of motion from a single motion blurred image using the discrete cosine transform (DCT). That is, let the frequency response of the motion blur function H ( x , y ) be 0, so as to calculate the blur length.
The frequency of the blurred function is correspondingly expressed as:
$$H(u) = \frac{\sin\!\left(\dfrac{l u \pi}{N}\right)}{\dfrac{l u \pi}{N}}, \quad 0 \le u \le N - 1$$

$$H(u) = 0 \;\Rightarrow\; \operatorname{sinc}(w_c\, l) = 0, \quad l = \frac{n \pi}{w_c}, \; n = 0, 1, 2, \ldots$$
From the above formulas, the blur length can be calculated; from the blur length, the actual moving distance can be obtained, and thus the instantaneous speed of the UAV at that moment.
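To make the zero-crossing relationship concrete: the zeros of Equation (8) occur at u = kN/l, so adjacent dark parallel lines in the spectrum are separated by d = N/l, and the blur length follows as l ≈ N/d. The sketch below assumes the positions of the dark lines along the frequency axis have already been detected (e.g., by the stripe-extraction procedure described below); it is an illustrative helper, not the authors' implementation.

```python
# Sketch: recover the blur length from the spacing of the dark parallel
# lines in the frequency domain. By Equation (8), H(u) = 0 at u = k*N/l,
# so adjacent dark lines are d = N/l apart and l = N/d.

import numpy as np

def blur_length_from_line_spacing(dark_line_positions, image_width):
    """dark_line_positions: pixel coordinates (along the frequency axis)
    of the detected dark lines; image_width: N in Equation (8)."""
    spacings = np.diff(np.sort(np.asarray(dark_line_positions, dtype=float)))
    d = np.median(spacings)            # robust estimate of the line spacing
    return image_width / d             # blur length l, in pixels

# Example with dark lines spaced ~32 px apart in a 2000-px-wide spectrum
print(blur_length_from_line_spacing([968, 1000, 1032, 1064], 2000))  # ~62.5
```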
In order to extract and measure the stripes and obtain the motion blur length and direction, we employed edge detection techniques such as the Canny edge detector to highlight significant transitions in intensity. The Hough transform is then utilized to detect straight lines in the edge-detected image, where lines are parameterized by their distance and angle from the origin in polar coordinates ( ρ , θ ). The length of each detected line segment is calculated using the Euclidean distance, and its orientation is derived from the Hough transform parameters. This method enables precise extraction and characterization of the stripe features in the image, facilitating further analysis. The process of removing bright cross-lines from the Fourier spectrum, as well as extracting the length and orientation of the stripes, is described in Algorithm 1.
Algorithm 1: Remove cross-lines and extract stripes in the Fourier spectrum
1: Input: Image I(x, y)
2: Output: Cleaned image with stripes measured
3: Apply Fourier transform to the image
4:     F ← FFT2(I)
5: Shift the zero-frequency component to the center
6:     F_shift ← FFTShift(F)
7: Create a binary mask to suppress bright cross-lines
8:     Initialize mask M of the same size as F with ones
9:     Define Mask_Size
10:    Center_Row, Center_Col ← Size(F) / 2
11: for i from (Center_Row − Mask_Size) to (Center_Row + Mask_Size) do
12:     for j from 1 to Size(F, 2) do
13:         M(i, j) ← 0
14:     end for
15: end for
16: for j from (Center_Col − Mask_Size) to (Center_Col + Mask_Size) do
17:     for i from 1 to Size(F, 1) do
18:         M(i, j) ← 0
19:     end for
20: end for
21: Apply the mask to the shifted Fourier image
22:     F_filtered ← F_shift × M
23: Inverse shift and apply the inverse Fourier transform
24:     F_ishift ← IFFTShift(F_filtered)
25:     I_clean ← IFFT2(F_ishift)
26:     I_clean ← Abs(I_clean)
27: Detect edges using the Canny edge detector
28:     Edges ← Canny(I_clean, threshold1, threshold2)
29: Apply the Hough transform to detect lines
30:     Lines ← HoughLines(Edges, ρ, θ, threshold)
31: Initialize lists for lengths and orientations
32:     Stripe_Lengths ← [ ]
33:     Stripe_Orientations ← [ ]
34: for each detected line (ρ, θ) in Lines do
35:     a ← cos(θ)
36:     b ← sin(θ)
37:     x0 ← a × ρ
38:     y0 ← b × ρ
39:     x1 ← x0 + 1000 × (−b)
40:     y1 ← y0 + 1000 × (a)
41:     x2 ← x0 − 1000 × (−b)
42:     y2 ← y0 − 1000 × (a)
43:     Length ← sqrt((x2 − x1)² + (y2 − y1)²)
44:     Orientation ← atan2((y2 − y1), (x2 − x1)) × 180/π
45:     Append Length to Stripe_Lengths
46:     Append Orientation to Stripe_Orientations
47: end for
48: Return I_clean, Stripe_Lengths, Stripe_Orientations
Algorithm 1 begins by applying the Fourier transform to convert the image to the frequency domain. A mask is then created to suppress bright cross-lines, representing high-frequency noise. After applying the mask and performing an inverse Fourier transform, the image is reverted to the spatial domain with reduced noise. Edge detection and the Hough transform are used to detect and measure stripes, calculating their lengths and orientations. The final output includes the cleaned image and measurements of the detected stripes.
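For reference, a possible NumPy/OpenCV realization of Algorithm 1 is sketched below. The mask half-width and the Canny/Hough thresholds are illustrative choices, not values reported by the authors; the line-endpoint construction follows the common OpenCV convention for lines returned by cv2.HoughLines.

```python
# Illustrative sketch of Algorithm 1 (not the authors' code): suppress the
# bright cross-lines in the Fourier spectrum, then measure the stripes.
import cv2
import numpy as np

def remove_cross_and_measure_stripes(img, mask_size=5,
                                     canny_lo=50, canny_hi=150,
                                     hough_thresh=120):
    # Fourier transform with the zero frequency shifted to the center
    F = np.fft.fftshift(np.fft.fft2(img))

    # Binary mask zeroing the horizontal/vertical bands through the center
    rows, cols = F.shape
    cr, cc = rows // 2, cols // 2
    M = np.ones_like(F, dtype=float)
    M[cr - mask_size:cr + mask_size + 1, :] = 0
    M[:, cc - mask_size:cc + mask_size + 1] = 0

    # Back to the spatial domain
    I_clean = np.abs(np.fft.ifft2(np.fft.ifftshift(F * M)))
    I_clean = cv2.normalize(I_clean, None, 0, 255,
                            cv2.NORM_MINMAX).astype(np.uint8)

    # Edge detection and Hough transform to find the stripes
    edges = cv2.Canny(I_clean, canny_lo, canny_hi)
    lines = cv2.HoughLines(edges, 1, np.pi / 180, hough_thresh)

    lengths, orientations = [], []
    if lines is not None:
        for rho, theta in lines[:, 0]:
            a, b = np.cos(theta), np.sin(theta)
            x0, y0 = a * rho, b * rho
            x1, y1 = x0 + 1000 * (-b), y0 + 1000 * a
            x2, y2 = x0 - 1000 * (-b), y0 - 1000 * a
            lengths.append(np.hypot(x2 - x1, y2 - y1))
            orientations.append(np.degrees(np.arctan2(y2 - y1, x2 - x1)))
    return I_clean, lengths, orientations
```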

2.3. Motion Blur Map Synthesized from Neighboring Frames

Recognizing the limitations of single-frame picture blur in scenarios where blur is not obvious at lower speeds, this study employs the synthesis of motion blur images from adjacent frames to enhance the accuracy of axial velocity estimation. Consider a pixel at location ( x , y ) in an image I ( x , y , t ) at time t. The multi-frame accumulation can be expressed as Equation (10):
$$\bar{I}(x, y) = \frac{1}{N} \sum_{t=1}^{N} I(x, y, t)$$
where N is the number of frames.
If the pixel moves to a new location ( x + δ x , y + δ y ) in a subsequent image at time t + δ t , the intensity of the pixel is assumed to remain constant. This assumption leads to the optical flow constraint Equation (11) [24,25]. The equation assumes that the intensity of a particular point in an image remains constant over time as it moves from one frame to the next [26,27].
$$I(x, y, t) = I(x + \delta x,\; y + \delta y,\; t + \delta t)$$
Taking the first-order Taylor expansion, we obtain Equation (12) [24]:
$$I(x, y, t) = I(x + \delta x,\; y + \delta y,\; t + \delta t) \approx I(x, y, t) + \frac{\partial I}{\partial x}\delta x + \frac{\partial I}{\partial y}\delta y + \frac{\partial I}{\partial t}\delta t$$
Simplifying Equation (12), we obtain Equation (13):
$$0 \approx \frac{\partial I}{\partial x}\delta x + \frac{\partial I}{\partial y}\delta y + \frac{\partial I}{\partial t}\delta t$$
Let $u = \frac{\delta x}{\delta t}$ and $v = \frac{\delta y}{\delta t}$ be the horizontal and vertical components of the optical flow, respectively. Then, the optical flow constraint equation can be written as Equation (14) [27,28]:

$$\frac{\partial I}{\partial x} u + \frac{\partial I}{\partial y} v + \frac{\partial I}{\partial t} = 0$$
The optical flow constraint equation provides a single linear equation for two unknowns (u and v). This system is under-determined, meaning there are infinitely many solutions for the motion vector ( u , v ) that satisfy the equation. This is the source of direct motion ambiguity.
The accumulation of multiple frames can help resolve motion ambiguity by providing additional constraints. The principle of synthesizing motion blur from neighboring frames is based on the motion blur of an image in continuous time. Suppose we have a series of consecutive frames of an image ( I 1 , I 2 , …, I N ) with object motion between each frame. We can accumulate a multi-frame motion blur effect, as shown in Figure 3.
Consider the accumulated image gradient as Equation (15):
$$\bar{I}_x = \frac{1}{N} \sum_{t=1}^{N} \frac{\partial I(x, y, t)}{\partial x}, \quad \bar{I}_y = \frac{1}{N} \sum_{t=1}^{N} \frac{\partial I(x, y, t)}{\partial y}, \quad \bar{I}_t = \frac{1}{N} \sum_{t=1}^{N} \frac{\partial I(x, y, t)}{\partial t}$$
The optical flow constraint equation can now be written in terms of these accumulated gradients as Equation (16):
$$\bar{I}_x u + \bar{I}_y v + \bar{I}_t = 0$$
By incorporating multiple frames, the effective gradient terms $\bar{I}_x$ and $\bar{I}_y$ become more robust, reducing the impact of noise and improving the accuracy of the motion estimation. This effectively decreases the ambiguity in determining the motion parameters u and v.
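Equation (16) still provides only one equation per pixel, so in practice the constraint is stacked over a pixel neighborhood and solved in the least-squares sense. The sketch below is a Lucas–Kanade-style illustration using the accumulated gradients of Equation (15); it is meant only to show how accumulation tightens the constraint, not to reproduce the authors' exact procedure.

```python
# Illustrative least-squares solve of the accumulated optical flow
# constraint (Equation (16)) over a patch of pixels.
import numpy as np

def flow_from_accumulated_gradients(frames):
    """frames: sequence of N >= 2 grayscale frames (H x W), float.
    Returns a single (u, v) estimate for the whole patch."""
    F = np.asarray(frames, dtype=float)             # shape (N, H, W)
    # Accumulated gradients, Equation (15)
    Ix = np.mean(np.gradient(F, axis=2), axis=0)    # d/dx, averaged over frames
    Iy = np.mean(np.gradient(F, axis=1), axis=0)    # d/dy, averaged over frames
    It = np.mean(np.diff(F, axis=0), axis=0)        # temporal difference
    # Stack Equation (16) over all pixels and solve in the least-squares sense
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)
    b = -It.ravel()
    (u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
    return u, v
```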
To derive the equivalence between multi-frame-synthesized motion blur and direct motion blur, let us consider the error terms in the motion-estimation process:
$$E(u, v) = \sum_{t=1}^{N} \left( \frac{\partial I(x, y, t)}{\partial x} u + \frac{\partial I(x, y, t)}{\partial y} v + \frac{\partial I(x, y, t)}{\partial t} \right)^{2}$$
For multi-frame accumulation, the error term can be written as Equation (18):
$$E_{\mathrm{acc}}(u, v) = N \left( \bar{I}_x u + \bar{I}_y v + \bar{I}_t \right)^{2}$$
For direct motion ambiguity, considering a single frame, the error term is Equation (19):
$$E_{\mathrm{single}}(u, v) = \left( \frac{\partial I(x, y)}{\partial x} u + \frac{\partial I(x, y)}{\partial y} v + \frac{\partial I(x, y)}{\partial t} \right)^{2}$$
To establish the equivalence, we need to show that the accumulation of multiple frames provides a better constraint system for the motion parameters:
$$\lim_{N \to \infty} E_{\mathrm{acc}}(u, v) = E_{\mathrm{single}}(u, v)$$
To simplify the operation of synthesizing images, we use the discrete form, which changes the integral equation into a discrete accumulation, and the continuous motion on the time axis into a discrete sequence of frames. For a discrete sequence of frames, the time interval between frames is $\Delta t$. The discrete motion blur equation is Equation (21):

$$\bar{I}(x, y) = \frac{1}{N} \sum_{i=1}^{N} I_i\!\left(x - V_i\, i \Delta t,\; y - U_i\, i \Delta t\right)$$

where, for the chosen camera frame rate of 30 fps, $\Delta t = \frac{1}{30}$ s.
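Under the near-uniform-motion assumption between adjacent frames, the discrete accumulation in Equation (21) amounts in practice to averaging N consecutive frames; the short sketch below uses a sliding window of N = 5 frames, matching the experiments in Section 3, and then computes the log-magnitude spectrum that Sections 2.2 and 3.2 analyze. It is a simplified illustration rather than the authors' code.

```python
import numpy as np

def synthesize_motion_blur(frames):
    """Average N consecutive grayscale frames (e.g. N = 5 at 30 fps, a 1/6 s
    window) to obtain a composite image with more pronounced motion blur,
    following the discrete accumulation of Equation (21)."""
    F = np.asarray(frames, dtype=np.float64)
    return F.mean(axis=0)

def log_magnitude_spectrum(img):
    """Centered log-magnitude spectrum used to reveal the parallel stripes."""
    return np.log1p(np.abs(np.fft.fftshift(np.fft.fft2(img))))

# Typical use with a sliding window of five frames from the video stream:
# blurred  = synthesize_motion_blur(frames[i:i + 5])
# spectrum = log_magnitude_spectrum(blurred)
```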

3. Experimental Results and Analysis

3.1. Introduction to the UAV Flight Experiment Platform and Speed Measurement Equipment

In our study, we employed the JCV-600 UAV as the experimental platform, equipped with a small on-board processor and a variety of sensors, including a binocular camera, 2D LiDAR, and a monocular camera, as shown in Figure 4. The UAV allows real-time observation of the forward environment through mobile devices and has a flight endurance of 70 min with no load and 30 min with full load. The on-board processor is the DJI Manifold 2-C, a high-performance on-board computer designed specifically for robots, with excellent processing power, fast response, and flexible expansion, running the Robot Operating System (ROS) under Ubuntu 18.04. The single-point laser ranging sensor is the TOFSense-F P [29], with a range of 0.5–25 m, a distance resolution of 1 mm, and a data-update frequency of up to 350 Hz. The HD USB color monocular camera, model RER-USB4KHDR01, incorporates a Sony IMX317 sensor with a resolution of 1920 × 1080, a 1/2.5-inch optical format, a focal length of 1.95 mm, a frame rate of 30 fps, and an exposure time of 10 ms ($t_1$ = 0.023 s, $t_2$ = 0.01 s).
To accurately measure the position and velocity of the UAV, we employed an ultra-wideband (UWB) system, which is mounted as shown in Figure 4b. The UWB system used in our experiments is the PNTC local positioning system from Nooploop’s LinkTrack [30]. This system provides high-precision positioning with an accuracy of 10 cm. It supports distributed ranging and digital transmission, with a bandwidth of up to 3 Mbps, a maximum refresh frequency of 200 Hz, and a communication range of up to 200 m. The velocity measured by this UWB system was used as a reference value for validating our motion blur-based velocity estimation method.

3.2. Single-Frame and Multi-Frame Synthetic Motion Blur Experiments

In our initial experiments, we focused on understanding the motion blur characteristics of single-frame and multi-frame images. Single-frame blur was analyzed to establish a direct relationship between the blur length and the UAV speed. However, recognizing the limitations of single-frame analysis at lower speeds where blur is not apparent, we synthesized motion blur images from adjacent frames. This multi-frame synthesis involved combining several consecutive frames to enhance the accuracy of axial velocity estimation. The approach allowed us to create more pronounced blur patterns, facilitating more reliable measurements.
The UAV’s motion during the short exposure time can be regarded as uniform linear motion. During high-speed motion, the blur observed in the monocular camera video stream is influenced solely by the UAV’s speed and motion angle. For our experiments, we assumed the UAV moves along the axis direction with the monocular camera mounted directly below it. This setup ensures the blur is related only to the motion speed, with the motion angle always being 0° (i.e., no consideration of UAV jitter), resulting in horizontal or vertical blur.
Taking horizontal blur as an example, the experiment outputs the length of blur from a single input blurred picture frame and calculates the actual displacement length and instantaneous velocity according to the blur length. By processing the images in the frequency domain and binarizing them, we obtained the results shown in Figure 5.
In the frequency domain image, the center represents the low-frequency components, while the edges represent the high-frequency components. The blurred image displays parallel stripes with equal intervals, consistent with the theoretical expectations. These images were generated by manually adding horizontal blur to ensure that the observed blur was due to motion, not defocus. The horizontal motion produces blur that forms vertical stripes in the frequency domain, aligning with the experimental expectations. The pixel width of the image is 2000, and after binarization, the blur length was determined to be 63 pixels, matching the experimental expectations.
In actual video streams, the frequency domain effect is not as ideal as manually produced blurred images. The overall effect shows a bright stripe at an angle rather than alternating stripes, as illustrated in Figure 6. This indicates motion blur rather than defocus blur, but the alternating light and dark stripes are not as clear, complicating the calculation of blur length.
Given the high update frame rate of the actual video stream (approximately 20 Hz for attitude updates and 30 fps for the monocular camera), we combined multiple frames into a single image to simulate the effect of artificial blur. The short interval between adjacent frames (30 fps) allows us to assume uniform linear motion between merged frames. The frequency domain transformation of these composite images yields more ideal results, as shown in Figure 7.
From a single-frame image, we can only determine whether the blur is due to motion, without obtaining the effective results for motion direction and blur length. However, the multi-frame frequency domain analysis in Figure 7c reveals alternating light and dark stripes similar to the ideal case, allowing for result analysis and calculation.
The number of superimposed frames is limited by the device’s frame rate, the minimum required update frequency, and the merging effect. In our experiments, we decomposed and synthesized the video stream and performed frequency domain transformation on synthetic images with varying frame numbers. The results are shown in Figure 8.
As illustrated in Figure 8, the number of synthesized frames significantly impacts the results in the frequency domain. The specific choice of frame number is closely related to the frame rate of the vision device. When the number of frames is too high, such as 10 or 20 frames, the difference in state between the start and end frames becomes pronounced. This discrepancy affects the final frequency domain results, preventing the clear appearance of alternating light and dark stripes and complicating the calculation of the motion blur length. Conversely, when the number of frames is too low, for example two frames, obvious alternating stripes are not discernible.
Optimal results were obtained with five frames, where clear alternating light and dark stripes were observed. This number of frames allows for more accurate extraction of stripe spacing, balances the computational load, and enhances the efficiency of the motion blur length computation.

3.3. Calibration of the Relational Model of Motion Speed and Motion Blur Length

The theoretical model relating UAV speed to the estimated image blur length is derived from Equations (1), (2) and (4). Using this model, we plot the theoretical curve of speed versus estimated image blur length based on the camera parameters, exposure time, focal length, distance between the UAV and the wall, and estimated motion blur length, as depicted by the blue line in Figure 9. To validate the theoretical model, we conducted five repeated experiments and plotted scatter points of the average UWB-measured UAV speeds against the average blur lengths estimated by our method. These scatter points, shown as black crosses in Figure 9, were then used to fit a measurement curve. The resulting fitted curve, represented by the red dashed line in Figure 9, illustrates the relationship between the UWB-measured axial speed of the UAV and the measured image blur length.
For the theoretical straight line between the center-axis velocity of the UAV and the length of the motion blur, the slope is 0.0202 and the intercept is 0; linear regression on the measured data yielded a slope of 0.0199 and an intercept of 0.0047 for the fitted line. To quantify the error between the measured points and the theoretical curve, we calculated residual statistics. The mean residual indicates that, on average, the measured data points deviate from the fitted line by 0.0404 m/s. The maximum residual shows the largest deviation, 0.1341 m/s, while the minimum absolute residual is very close to zero, indicating a high degree of accuracy in some measurements. The RMS residual of 0.0535 m/s further supports the overall accuracy of our method.
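The calibration in this section can be reproduced in outline with a simple linear fit: regress the averaged UWB speeds on the averaged blur lengths and compare the result with the theoretical slope from Equations (1), (2) and (4). The helper below is an illustrative sketch; the measured arrays and the statistics reported above come from the authors' experiments and are not embedded here.

```python
import numpy as np

def calibrate_speed_model(blur_len_px, uwb_speed_mps, theoretical_slope=0.0202):
    """Fit speed = a * blur_length + b to measured pairs and report residual
    statistics against the fitted line (cf. Section 3.3)."""
    blur_len_px = np.asarray(blur_len_px, dtype=float)
    uwb_speed_mps = np.asarray(uwb_speed_mps, dtype=float)
    a, b = np.polyfit(blur_len_px, uwb_speed_mps, deg=1)
    residuals = uwb_speed_mps - (a * blur_len_px + b)
    return {
        "slope": a, "intercept": b,
        "theoretical_slope": theoretical_slope,
        "mean_abs_residual": float(np.mean(np.abs(residuals))),
        "max_abs_residual": float(np.max(np.abs(residuals))),
        "rms_residual": float(np.sqrt(np.mean(residuals ** 2))),
    }
```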
The consistency between the theoretical curve and the measured data validates our approach and underscores the effectiveness of using motion blur for axial velocity estimation in UAV applications. The experimental results, combined with the theoretical predictions, confirm that our method can reliably estimate the speed of a UAV in a confined environment, such as a tunnel or corridor.
By fitting the measured curve to the theoretical curve, we demonstrated that our theoretical derivation accurately models the real-world behavior of UAV motion blur. This strong correlation between the theoretical and measured curves enhances the credibility of our method and supports its application in various UAV navigation and velocity estimation scenarios.

3.4. Field Flight Experiments in the Corridor of a Hydroelectric Power Plant

To further validate our method, we conducted field flight experiments in the corridor of a hydroelectric power plant. The UAV was flown in various sections of the corridor, including straight, stairway, and corner sections, as shown in Figure 10. Six experiments were undertaken, with the UWB truth velocity and motion blur velocity estimation methodologies used for UAV localization.
Each section was subjected to two flight tests, utilizing the UWB reference velocity and motion blur velocity estimation methodologies for UAV localization, respectively. The experimental results are presented in Table 1 below.
As shown in Table 1, the mean errors across the six UAV experiments conducted in distinct sections of the corridor remained consistently within the acceptable range of 0.065 m/s. The root-mean-squared error was confined to 0.3 m/s, and the minimum error attained a value of 3 × 10⁻⁴ m/s. Notably, the UWB signal in the corner section experienced interference due to reflections from the walls, contributing to a maximum error of 0.6034 m/s in the UWB-based localization.
During the fifth experiment, the UAV exhibited drift and encountered difficulties navigating corners based on UWB-based localization. This large error was primarily due to poor UWB signal reception at the corridor corners, leading to reduced localization accuracy and unreliable speed measurements. Consequently, the UAV’s flight was not smooth during the fifth test, as the localization errors caused instability.
However, in the subsequent sixth experiment utilizing motion blur velocity measurement, the UAV demonstrated stable flight and could smoothly turn around the corners of the corridor. This improvement highlights the effectiveness of our method in mitigating the issues caused by UWB signal interference. Across these six flight experiments, the flight based on motion blur speed estimation performed well, affirming the stability of the designed algorithm.
Taking the straight-section test as an example, the mean error between the motion blur-based speed and the UWB-based speed was 0.039 m/s, with a root-mean-squared error of 0.1964 m/s. The maximum and minimum speed errors were 0.4812 m/s and 1.319 × 10⁻⁴ m/s, respectively. Figure 11 visually confirms the consistency of the two curves, depicting overlapping trends in the velocity changes. The mean error, consistently within an acceptable range, further attests to the efficacy of the employed methodology.
The experimental results demonstrated that the proposed algorithm provided reliable velocity estimates, even in challenging environments with poor lighting and weak texture. The consistency between the theoretical curve and the measured data validated our approach and underscored the effectiveness of using motion blur for axial velocity estimation in UAV applications.

4. Discussion of Limitations

While our method shows promise for estimating UAV axial velocity using motion blur, it is not without limitations. First, the accuracy of our approach relies on the assumption that the UAV’s movement is minimal and the surrounding light intensity remains constant within the selected frame rate of the camera (30 fps). We selected five neighboring frames, which translates into a duration of 1/6 s, under the assumption that the UAV’s movement during this period is small and the light conditions do not vary significantly. This assumption may not hold in all real-world scenarios, particularly in environments with variable lighting or more significant UAV movement.
Additionally, the method’s accuracy in speed measurement is relatively low. While it is suitable for applications where approximate speed estimation is sufficient, it may not be appropriate for fields that require precise speed measurements. The limitations of using motion blur for velocity estimation include potential errors introduced by variations in lighting conditions, UAV jitter, and the inherent assumption of uniform motion between frames. These factors can affect the reliability of the speed estimation, particularly in more complex or dynamic environments.
In summary, while our method provides a novel approach for estimating UAV axial velocity in confined environments such as tunnels or corridors, it is important to recognize these limitations and consider them when interpreting the results. Future work should focus on addressing these limitations by exploring methods to account for variable lighting conditions and more significant UAV movements, as well as enhancing the precision of speed measurements.

5. Conclusions

This study successfully ascertains the axial velocity of a UAV through an analysis of blurred images within the monocular camera’s video stream. The methodology involves synthesizing images with minor blur from adjacent frames into a distinctly blurred composite, which is then converted to the frequency domain to extract the motion blur length. The UAV’s axial speed is computed by correlating the motion blur length with the actual movement distance.
Experimental validations within the corridors of a hydropower station confirmed the feasibility and stability of the proposed algorithm, which effectively estimated the UAV's speed even in challenging environments with poor lighting and weak textures. The results show that the mean errors across six UAV experiments were consistently within the acceptable range of 0.065 m/s, with the root-mean-squared error confined to 0.3 m/s and a minimum error of 3 × 10⁻⁴ m/s. Notably, interference from wall reflections in the corner section caused a maximum error of 0.6034 m/s in the UWB-based localization. During the fifth experiment, poor UWB signal reception led to reduced localization accuracy and unreliable speed measurements. However, the sixth experiment, using the proposed motion blur velocity-measurement method, demonstrated stable flight and smooth navigation around corners, highlighting the method's effectiveness in mitigating UWB signal interference.
The experimental results underscore the potential of motion blur-based velocity estimation for UAVs, especially in confined environments like tunnels or corridors. The method’s robustness in handling poor lighting and weak textures suggests its broader applicability in UAV navigation. Future work should focus on refining the algorithm, exploring diverse environments, and integrating advanced sensor fusion techniques to enhance accuracy and reliability. This research marks a significant stride in advancing image-processing techniques for UAV axial velocity determination, with practical implications across a spectrum of applications.

Author Contributions

Conceptualization, Y.M. and Q.Z.; methodology, Q.Z. and Y.M.; software, Q.Z.; validation, Y.M., Q.Z., L.Y. and C.Z.; formal analysis, Q.Z. and R.S.; investigation, Y.M., Q.Z. and L.Y.; resources, Y.M., Q.Z. and G.X.; data curation, Y.M. and R.S.; writing—original draft preparation, Y.M.; writing—review and editing, Y.M., Q.Z., L.Y., C.Z. and R.S.; visualization, Y.M.; supervision, Q.Z. and G.X.; project administration, Q.Z.; funding acquisition, Q.Z. and G.X. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Hubei Technology Innovation Center for Smart Hydropower, Wuhan Hubei 430000 (1520020005).

Data Availability Statement

Due to the nature of this research, the participants of this study did not agree for their data to be shared publicly, so supporting data are not available.

Conflicts of Interest

Authors Yedong Mao, Chunhui Zhang and Ge Xu were employed by the company China Yangtze Power Co., Ltd., Technology and Research Center, Yichang City, China. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Xu, X.; Zhang, L.; Yang, J.; Cao, C.; Wang, W.; Ran, Y.; Tan, Z.; Luo, M. A Review of Multi-Sensor Fusion SLAM Systems Based on 3D LIDAR. Remote Sens. 2022, 14, 2835. [Google Scholar] [CrossRef]
  2. Drocourt, C.; Delahoche, L.; Marhic, B.; Clerentin, A. Simultaneous localization and map construction method using omnidirectional stereoscopic information. In Proceedings of the IEEE International Conference on Robotics and Automation, Washington, DC, USA, 11–15 May 2002. [Google Scholar] [CrossRef]
  3. Hao, Y.; He, M.; Liu, Y.; Liu, J.; Meng, Z. Range–Visual–Inertial Odometry with Coarse-to-Fine Image Registration Fusion for UAV Localization. Drones 2023, 7, 540. [Google Scholar] [CrossRef]
  4. de Curtò, J.; de Zarzà, I. Hybrid State Estimation: Integrating Physics-Informed Neural Networks with Adaptive UKF for Dynamic Systems. Electronics 2024, 13, 2208. [Google Scholar] [CrossRef]
  5. Gu, W.; Primatesta, S.; Rizzo, A. Physics-informed Neural Network for Quadrotor Dynamical Modeling. Robot. Auton. Syst. 2024, 171, 104569. [Google Scholar] [CrossRef]
  6. Zhao, J.; Wang, Y.; Cai, Z.; Liu, N.; Wu, K.; Wang, Y. Learning Visual Representation for Autonomous Drone Navigation Via a Contrastive World Model. IEEE Trans. Artif. Intell. 2023, 5, 1263–1276. [Google Scholar] [CrossRef]
  7. Phan, T.; Vo, K.; Le, D.; Doretto, G.; Adjeroh, D.; Le, N. ZEETAD: Adapting Pretrained Vision-Language Model for Zero-Shot End-to-End Temporal Action Detection. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, 1–6 January 2024; pp. 7046–7055. [Google Scholar]
  8. Li, L.; Xiao, J.; Chen, G.; Shao, J.; Zhuang, Y.; Chen, L. Zero-shot visual relation detection via composite visual cues from large language models. Adv. Neural Inf. Process. Syst. 2024, 36. [Google Scholar]
  9. Bao, L.; Li, K.; Lee, J.; Dong, W.; Li, W.; Shin, K.; Kim, W. An Enhanced Indoor Three-Dimensional Localization System with Sensor Fusion Based on Ultra-Wideband Ranging and Dual Barometer Altimetry. Sensors 2024, 24, 3341. [Google Scholar] [CrossRef] [PubMed]
  10. Yuksel, T. Sliding Surface Designs for Visual Servo Control of Quadrotors. Drones 2023, 7, 531. [Google Scholar] [CrossRef]
  11. Pointon, H.A.G.; McLoughlin, B.J.; Matthews, C.; Bezombes, F.A. Towards a Model Based Sensor Measurement Variance Input for Extended Kalman Filter State Estimation. Drones 2019, 3, 19. [Google Scholar] [CrossRef]
  12. Elmokadem, T. A 3D Reactive Navigation Method for UAVs in Unknown Tunnel-like Environments. In Proceedings of the 2020 Australian and New Zealand Control Conference (ANZCC), Gold Coast, Australia, 26–27 November 2020; pp. 119–124. [Google Scholar] [CrossRef]
  13. Tan, C.H.; Ng, M.; Shaiful, D.S.B.; Win, S.K.H.; Ang, W.J.; Yeung, S.K.; Lim, H.B.; Do, M.N.; Foong, S. A smart unmanned aerial vehicle (UAV) based imaging system for inspection of deep hazardous tunnels. Water Pract. Technol. 2018, 13, 991–1000. [Google Scholar] [CrossRef]
  14. Li, H.; Savkin, A.V. An Optical Flow based Tunnel Navigation Algorithm for a Flying Robot. In Proceedings of the 2018 IEEE International Conference on Robotics and Biomimetics (ROBIO), Kuala Lumpur, Malaysia, 12–15 December 2018; pp. 1767–1770. [Google Scholar] [CrossRef]
  15. Ge, S.; Pan, F.; Wang, D.; Ning, P. Research on An Autonomous Tunnel Inspection UAV based on Visual Feature Extraction and Multi-sensor Fusion Indoor Navigation System. In Proceedings of the 2021 33rd Chinese Control and Decision Conference (CCDC), Kunming, China, 22–24 May 2021; pp. 6082–6089. [Google Scholar] [CrossRef]
  16. Zhou, W.; Li, Y.; Peng, X. Autonomous Flight and Map Construction for Coal Fields in Vision UAV Sheds. Navig. Position. Timing 2018, 5, 32–36. [Google Scholar]
  17. Özaslan, T.; Loianno, G.; Keller, J.; Taylor, C.J.; Kumar, V. Spatio-Temporally Smooth Local Mapping and State Estimation Inside Generalized Cylinders With Micro Aerial Vehicles. IEEE Robot. Autom. Lett. 2018, 3, 4209–4216. [Google Scholar] [CrossRef]
  18. Jung, K. ALVIO: Adaptive Line and Point Feature-Based Visual Inertial Odometry for Robust Localization in Indoor Environments; Lecture Notes in Mechanical Engineering; Springer: Singapore, 2021; pp. 171–184. [Google Scholar] [CrossRef]
  19. Dwicahya, J.A.; Ramadijanti, N.; Basuki, A. Moving Object Velocity Detection Based on Motion Blur on Photos Using Gray Level. In Proceedings of the 2018 International Electronics Symposium on Knowledge Creation and Intelligent Computing (IES-KCIC), Bali, Indonesia, 29–30 October 2018; pp. 192–198. [Google Scholar] [CrossRef]
  20. Moghaddam, M.E.; Jamzad, M. Finding point spread function of motion blur using Radon transform and modeling the motion length. In Proceedings of the Fourth IEEE International Symposium on Signal Processing and Information Technology, Rome, Italy, 18–21 December 2004; pp. 314–317. [Google Scholar] [CrossRef]
  21. Li, Q.; Yoshida, Y. Parameter Estimation and Restoration for Motion Blurred Images (Special Section on Digital Signal Processing). IEICE Trans. Fundam. Electron. Commun. Comput. Sci. 1997, 80, 1430–1437. [Google Scholar]
  22. Fei, X.; Hong, C.; Meng, C. An Algorithm of Image Restoration Based on Blur Parameter Identification with Cepstrum. Electron. Opt. Control 2011, 18, 49–54. [Google Scholar]
  23. Cortés-Osorio, J. A.; Gómez-Mendoza, J. B.; Riaño-Rojas, J. C. Velocity Estimation From a Single Linear Motion Blurred Image Using Discrete Cosine Transform. IEEE Trans. Instrum. Meas. 2019, 68, 4038–4050. [Google Scholar] [CrossRef]
  24. Horn, B.K.P.; Schunck, B.G. Determining Optical Flow. Artif. Intell. 1981, 17, 185–203. [Google Scholar] [CrossRef]
  25. Bereziat, D.; Herlin, I.; Younes, L. A generalized optical flow constraint and its physical interpretation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2000 (Cat. No.PR00662), Hilton Head, SC, USA, 15 June 2000; Volume 2, pp. 487–492. [Google Scholar] [CrossRef]
  26. Doshi, H.; Kiran, N.U. Constraint Based Refinement of Optical Flow. arXiv 2020, arXiv:2011.12267. Available online: https://api.semanticscholar.org/CorpusID:227151616 (accessed on 3 July 2024).
  27. Del Bimbo, A.; Nesi, P.; Sanz, J.L.C. Analysis of optical flow constraints. IEEE Trans. Image Process. 1995, 4, 460–469. [Google Scholar] [CrossRef] [PubMed]
  28. Del Bimbo, A.; Nesi, P.; Sanz, J.L.C. Optical flow computation using extended constraints. IEEE Trans. Image Process. 1996, 5, 720–739. [Google Scholar] [CrossRef]
  29. TOFSense-F_Datasheet. Nooploop. Available online: https://ftp.nooploop.com/software/products/tofsense-f/doc/TOFSense-F_Datasheet_V1.2_en.pdf (accessed on 29 December 2023).
  30. LinkTrack. Nooploop. Available online: https://www.nooploop.com/en/linktrack/ (accessed on 29 December 2023).
Figure 1. Schematic representation of the camera imaging relationship: blur length correlation with UAV movement distance during the exposure time.
Figure 2. Grayscale and corresponding frequency images: (a) grayscale image; (b) corresponding frequency image.
Figure 3. Schematic of multi-frame-synthesized motion blur: (a) Frame 1; (b) Frame 2; (c) Frame 3; (d) Frame 4; (e) Frame 5; (f) five-frame-synthesized motion blur image.
Figure 4. UAV flying in the corridor and the JCV-600 UAV experimental platform: (a) drone flying in the corridor; (b) JCV-600 UAV platform and UWB equipment for measuring velocity; (c) JCV-600 UAV platform equipped with the monocular camera, single-point laser, and fill-light equipment.
Figure 5. Frequency domain experiment diagram: (a) original blurred image; (b) corresponding frequency domain representation; (c) binarized frequency domain image.
Figure 6. Frequency domain representation of an actual video frame from the UAV: (a) original blurred image; (b) corresponding frequency domain representation illustrating bright stripes indicating motion blur and highlighting the challenges in detecting alternating light and dark stripes.
Figure 7. Downward-facing images from the UAV on-board camera and the frequency domain diagram of multi-frame composites: (a) initial-frame original image; (b) single-frame frequency domain diagram; (c) multi-frame frequency domain diagram.
Figure 8. Spectrograms of motion blur maps synthesized from different numbers of neighboring frames: (a) two neighboring frames; (b) five neighboring frames; (c) ten neighboring frames; (d) twenty neighboring frames.
Figure 9. The relationship between the axial velocity of the UAV and the image blur length.
Figure 10. Field experiment diagram of the hydropower station corridor: (a) drone flying in a long corridor; (b) drone flying on the stairs; (c) drone approaching a corner; (d) drone flying into a corner.
Figure 11. Speed comparison of the UAV in the straight-section flight.
Table 1. Results of the six flight experiments.

Test Serial Number | Mean Error (m/s) | Minimum Error (m/s) | Maximum Error (m/s) | RMSE (m/s)
1 (straight section) | 0.039 | 1.319 × 10⁻⁴ | 0.3812 | 0.1964
2 (straight section) | 0.026 | 1.489 × 10⁻⁴ | 0.3645 | 0.1688
3 (stairway section) | 0.040 | 1.616 × 10⁻⁴ | 0.3489 | 0.2016
4 (stairway section) | 0.042 | 1.539 × 10⁻⁴ | 0.3268 | 0.2198
5 (turning section) | 0.061 | 2.239 × 10⁻⁴ | 0.5673 | 0.2435
6 (turning section) | 0.054 | 2.846 × 10⁻⁴ | 0.6034 | 0.2652
