
Adaptive Fuzzy Fault-Tolerant Attitude Control for a Hypersonic Gliding Vehicle: A Policy-Iteration Approach

Meijie Liu ^1,2, Changhua Hu ^1,*, Hong Pei ^1, Hongzeng Li ^1 and Xiaoxiang Hu

^1 204 Unit, Xi’an Research Institute of High-Tech, Xi’an 710025, China

^2 Engineering Training Center, Xi’an University of Science and Technology, Xi’an 710054, China

^3 School of Automation, Northwestern Polytechnical University, Xi’an 710072, China

^* Author to whom correspondence should be addressed.

Actuators 2024, 13(7), 259; https://doi.org/10.3390/act13070259

Submission received: 7 June 2024 / Revised: 1 July 2024 / Accepted: 4 July 2024 / Published: 9 July 2024

(This article belongs to the Section Control Systems)


Abstract

In this paper, adaptive fuzzy fault-tolerant control (AFFTC) is proposed for the attitude control system of a hypersonic gliding vehicle (HGV) experiencing actuator faults. The actuator faults of the HGV are modeled with respect to its actual structure and actuator characteristics. The HGV’s attitude system is first represented by a T–S fuzzy model, and a normal T–S fuzzy controller is then designed. A reinforcement learning (RL)-based policy-iteration algorithm is proposed to solve for the T–S fuzzy controller. Then, based on the normal T–S controller, a fuzzy FTC controller is proposed whose control matrices adjust themselves according to the specific fault. An integral reinforcement learning (IRL)-based solution algorithm is proposed to reduce the dependence of the design method on the HGV model. Simulations of three different kinds of actuator faults show that the designed IRL-based FTC can ensure reliable flight of the HGV.

Keywords: T–S fuzzy model; fault-tolerant control; reinforcement learning; hypersonic gliding vehicle

1. Introduction

A hypersonic gliding vehicle (HGV) flies at high speed over a large flight envelope, so its flight dynamics are highly complex. As a basic element of aircraft control, the modeling and realization of HGV attitude control have received extensive attention in recent years. Various control methods have been used to solve this problem, such as trajectory-linearization-based active disturbance rejection control [1], fuzzy logic-based adaptive control [2], and reinforcement-learning-based nonlinear control [3]. Linear parameter-varying attitude control for an HGV is proposed in [4], where the sources of uncertainty in HGV control are identified as uncertainty in the aerodynamic parameters and external disturbances. Robust nonlinear controllers for the attitude system of an HGV subject to model disturbances have also received widespread attention.

In practice, faults may exist in the actuators of the HGV, and such failures necessarily reduce the control accuracy of the HGV, so FTC is an important research area in the attitude control of HGVs and has been widely studied in recent years. In ref. [5], actuator faults of an HGV’s attitude control system are considered, and adaptive FTC is proposed. In ref. [6], actuator faults of an HGV’s attitude control system are described by a more general model, and an FTC that ensures the robustness of the system under actuator faults is given. In ref. [7], fixed-time quantized fault-tolerant attitude control is presented for an HGV. In the above literature, the model of the HGV’s attitude system is assumed to be known, while in practice the nonlinear dynamics of the HGV may be unknown. In ref. [8], adaptive FTC is proposed for an HGV’s attitude system with an unknown inertial matrix and state constraints. In ref. [9], time-varying fault control of an HGV is proposed together with an adaptive FTC. In ref. [10], actuator faults and model uncertainties are considered simultaneously, and sliding-mode-control-based FTC is proposed.

In the above literature, the specific form and occurrence time of the faults are assumed to be known, and the FTC is designed according to these known faults. This is unreasonable in practice, as the specific form and occurrence time of faults are difficult to obtain. In ref. [11], online computing-method-based FTC is proposed for the attitude control of an HGV, but this method can deal only with fixed-form faults. In ref. [12], iterative learning fault-tolerant control is proposed for time-varying industrial processes with actuator faults, and in ref. [13], stochastic actuator failures are considered for Markovian jump time-delayed systems. The FTC of a constrained system is solved by model predictive control in ref. [14]. These controller design strategies can deal with some given faults but cannot adjust themselves online according to a specific fault, so this is still an open problem. It should be noted that, to facilitate practical use, the form of an adaptive FTC should be as simple as possible.

The Takagi–Sugeno (T–S) fuzzy-based control design strategy can approximate a nonlinear system with arbitrary accuracy, and the resulting T–S model is a combination of linear systems, so classical linear control methods can still be applied through a parallel-distributed compensation (PDC) scheme [15]. T–S fuzzy technology is viewed as an efficient way of analyzing and designing the control of nonlinear systems and has been widely applied, for example to type-2 T–S fuzzy-based tracking control of a saturation system [16] and T–S fuzzy control for semi-Markov jumps [17]. T–S fuzzy FTC of HGVs has also been studied in [18], but there the upper and lower bounds of the faults are assumed to be known. How to design FTC for HGVs with faults of unknown specific form and unknown occurrence time is still an open problem.

Furthermore, in order to improve the self-learning ability of the FTC, an intelligent algorithm is needed. Commonly used intelligent algorithms include deep learning [19] and reinforcement learning (RL) [20]. The learning process of RL is similar to that of human beings. By interacting with the environment and updating the reward value in time, RL can obtain an optimal control law. RL does not need a specific model of the nonlinear system and can handle high-dimensional task scenarios, so it has been widely applied across many fields [21].

As a branch of RL, integral reinforcement learning (IRL) obtains its reinforcement signal by integrating the value function, and can thereby be applied to unknown systems. Using only data from a completely unknown system, IRL can complete the assigned learning task, so it has been successfully applied to the optimal control of discrete-time multiagent systems [22], motion planning of autonomous vehicles [23], nuclear systems [24], and linear systems with input delay [25]. IRL-based fault-tolerant adaptive tracking control of Euler–Lagrange systems is also presented to improve the tracking performance of fault-tolerant control [26]. In short, IRL has proved to be an effective means of solving the FTC design problem for complex nonlinear systems [27].

Based on the above understanding, in this paper, an RL-based policy-iteration (PI) algorithm is utilized for the design of FTC of an HGV’s attitude system. The nonlinear model of the HGV’s attitude system is first represented by a T–S fuzzy model, and then the actuator fault model is built. A PI-based method for solving the normal T–S fuzzy controller is proposed without considering the actuator fault model. Based on the normal fuzzy controller, IRL-based fuzzy adaptive FTC is proposed. The control gains of the adaptive fuzzy FTC controller can self-adjust according to the specific actuator fault. Finally, three simulation results are given to demonstrate the effectiveness of the presented controller under different faults.

To sum up, the paper’s contributions are:

(1) A policy-iteration (PI) algorithm is utilized for the optimal controller design of a T–S fuzzy system.

(2) IRL-based adaptive fuzzy FTC is built for a T–S fuzzy system with actuator faults. The resulting FTC controller can be utilized online.

(3) This method is successfully applied to the attitude-tracking control of an HGV’s attitude system.

2. Problem Description

In this section, the nonlinear attitude model of an HGV is discussed, and then the nonlinear model is represented by a T–S fuzzy system. Based on the attitude-tracking objective and the actuator fault model of the HGV, the control objective for the proposed T–S fuzzy model is discussed.

2.1. Nonlinear Model of an HGV’s Attitude System

The attitude model of an HGV used in this paper is

\left\{\begin{matrix} \dot{α} = ω_{z} - ω_{x} cos α tan β + ω_{y} sin α tan β \\ - sec β ((Y \frac{cos γ_{v}}{m V} - D \frac{sin γ_{v}}{m V} - \frac{m g cos θ}{m V}) cos γ_{v} \\ + (Y \frac{sin γ_{v}}{m V cos θ} - D \frac{cos γ_{v}}{m V cos θ}) cos θ sin γ_{v}) \\ \dot{β} = ω_{x} sin α + ω_{y} cos α \\ - (Y \frac{cos γ_{v}}{m V} - D \frac{sin γ_{v}}{m V} - \frac{m g cos θ}{m V}) sin γ_{v} \\ + (Y \frac{sin γ_{v}}{m V cos θ} - D \frac{cos γ_{v}}{m V cos θ}) cos θ cos γ_{v} \\ {\dot{γ}}_{v} = sec β (ω_{x} cos α - ω_{y} sin α) \\ + (Y \frac{cos γ_{v}}{m V} - D \frac{sin γ_{v}}{m V} - \frac{m g cos θ}{m V}) tan β cos γ_{v} \\ + (Y \frac{sin γ_{v}}{m V cos θ} - D \frac{cos γ_{v}}{m V cos θ}) \\ \cdot (sin θ + tan β cos θ sin γ_{v}) \\ J_{x} \frac{d ω_{x}}{d t} + (J_{x} - J_{y}) ω_{z} ω_{y} = M_{x} \\ J_{y} \frac{d ω_{y}}{d t} + (J_{x} - J_{z}) ω_{x} ω_{z} = M_{y} \\ J_{z} \frac{d ω_{z}}{d t} + (J_{y} - J_{x}) ω_{y} ω_{x} = M_{z} \end{matrix}\right.

(1)

where V is the velocity of the HGV, θ is the flight path angle, and ψ_{v} is the heading angle; L, D, and Y are the lift force, drag force, and side force, respectively; α is the attack angle, β is the sideslip angle, and γ_{v} is the velocity inclination angle; ω_{x}, ω_{y}, and ω_{z} are the angular velocities about the three axes, respectively; and M_{x}, M_{y}, and M_{z} are the corresponding aerodynamic torques, which are functions of α, β, γ_{v}, δ_{e}, δ_{r}, and δ_{a}, where δ_{e}, δ_{r}, and δ_{a} are the elevator angle, yaw angle, and aileron angle, respectively. More details of the model can be found in ref. [5].

For convenience of description, the nonlinear model (1) can be rewritten as

\dot{x} (t) = F (x (t), u (t))

where

\begin{matrix} x (t) & = & {[\begin{matrix} α, & β, & γ_{v}, & ω_{x}, & ω_{y}, & ω_{z} \end{matrix}]}^{T}, \\ u (t) & = & {[\begin{matrix} δ_{e}, & δ_{r}, & δ_{a} \end{matrix}]}^{T} \end{matrix}

and the expression of F (x (t), u (t)) follows from (1).

2.2. T–S Fuzzy Modeling of an HGV

Since the parameters of an HGV vary over a large range, a single linear model obtained by linearization about one equilibrium point is unsuitable for controller design. For ease of application in the control system, T–S fuzzy modeling is utilized here.

For the application of T–S fuzzy modeling to an HGV’s attitude system, the premise variable is chosen as α. Three levels of α are then chosen, the specific values of which are listed in Table 1, and the T–S model of the HGV attitude system is built as follows:

(Rule 1) If α is α_{S}, then

\dot{x} (t) = A_{1} x (t) + B_{1} u (t),

(Rule 2) If α is α_{M}, then

\dot{x} (t) = A_{2} x (t) + B_{2} u (t),

(Rule 3) If α is α_{B}, then

\dot{x} (t) = A_{3} x (t) + B_{3} u (t),

where A_{i} and B_{i} (i = 1, 2, 3, corresponding to the levels S, M, and B) are the system matrices

\begin{matrix} A_{i} & = & {\left. \frac{\partial F}{\partial x} \right|}_{x = x_{i}, u = u_{i}}, \\ B_{i} & = & {\left. \frac{\partial F}{\partial u} \right|}_{x = x_{i}, u = u_{i}}, \end{matrix}

where x_{i} = (α_{i}, β_{i}, γ_{v i}), i = S, M, B, and u_{i} is the equilibrium point associated with x_{i}.
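The local matrices A_{i} and B_{i} are Jacobians of F evaluated at an operating point. As a minimal sketch of that step, the following Python code approximates the two Jacobians by central differences. The two-state function `toy_F` and its operating point are hypothetical stand-ins chosen only for illustration; they are not the HGV model (1), and the paper does not state how its Jacobians were computed.

```python
import math

def jacobian(f, z0, eps=1e-6):
    """Central-difference Jacobian of f at z0 (f maps a list to a list)."""
    n_out, n_in = len(f(z0)), len(z0)
    J = [[0.0] * n_in for _ in range(n_out)]
    for j in range(n_in):
        zp, zm = list(z0), list(z0)
        zp[j] += eps
        zm[j] -= eps
        fp, fm = f(zp), f(zm)
        for i in range(n_out):
            J[i][j] = (fp[i] - fm[i]) / (2.0 * eps)
    return J

# Hypothetical toy dynamics (assumption, NOT model (1)):
#   x1' = x2,  x2' = -sin(x1) + u
def toy_F(x, u):
    return [x[1], -math.sin(x[0]) + u[0]]

# Operating point chosen so that toy_F(x_op, u_op) = 0 (an equilibrium).
x_op = [0.3, 0.0]
u_op = [math.sin(0.3)]

A_local = jacobian(lambda x: toy_F(x, u_op), x_op)  # dF/dx at (x_op, u_op)
B_local = jacobian(lambda u: toy_F(x_op, u), u_op)  # dF/du at (x_op, u_op)
```

Repeating this at each of the three operating points would give one (A_{i}, B_{i}) pair per fuzzy rule.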

Based on these rules, the overall T–S fuzzy model of the HGV’s attitude system can be represented by:

\begin{matrix} \dot{x} (t) & = & \sum_{i = 1}^{3} μ_{i} (t) (A_{i} x (t) + B_{i} u (t)) \\ & = & \bar{A} x (t) + \bar{B} u (t) \end{matrix}

(2)

where

\bar{A} = \sum_{i = 1}^{3} μ_{i} (t) A_{i}, \bar{B} = \sum_{i = 1}^{3} μ_{i} (t) B_{i},

and μ_{i} (t) is the membership function of the ith rule, with μ_{i} (t) \geq 0, i = 1, 2, 3, and \sum_{i = 1}^{3} μ_{i} (t) = 1.
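The blending in (2) can be illustrated with a short Python sketch. It assumes triangular membership functions centered on the three levels of α, which is a common choice but is not specified in the text; the center values and the 1×1 local matrices below are placeholders, not the entries of Table 1.

```python
def memberships(alpha, centers):
    """Triangular memberships on sorted centers; nonnegative and summing to 1."""
    mu = [0.0] * len(centers)
    if alpha <= centers[0]:
        mu[0] = 1.0
    elif alpha >= centers[-1]:
        mu[-1] = 1.0
    else:
        for k in range(len(centers) - 1):
            lo, hi = centers[k], centers[k + 1]
            if lo <= alpha <= hi:
                w = (alpha - lo) / (hi - lo)
                mu[k], mu[k + 1] = 1.0 - w, w
                break
    return mu

def blend(mats, mu):
    """Weighted sum of equally sized matrices: sum_i mu_i * M_i."""
    rows, cols = len(mats[0]), len(mats[0][0])
    return [[sum(m * M[r][c] for m, M in zip(mu, mats)) for c in range(cols)]
            for r in range(rows)]

# Placeholder levels alpha_S, alpha_M, alpha_B (assumed values, in degrees).
CENTERS = [0.0, 10.0, 20.0]
# Placeholder 1x1 local matrices A_1, A_2, A_3 (assumed values).
A_LOCAL = [[[-1.0]], [[-2.0]], [[-4.0]]]

mu = memberships(5.0, CENTERS)   # halfway between alpha_S and alpha_M
A_bar = blend(A_LOCAL, mu)       # blended matrix \bar{A} at this alpha
```

Because at most two neighboring rules are active at any α under this assumption, the blended model interpolates linearly between adjacent local models.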

2.3. Actuator Fault Model

To account for the actuator faults of an HGV, the following actuator fault model is utilized:

u^{F} (t) = Γ_{F} u (t) - Λ_{F},

(3)

where Γ_{F} is the failure matrix, which represents the failure coefficient and is unknown; Λ_{F} is the loss matrix, which denotes a bias fault and is also unknown; u (t) is the normal control signal; and

Γ_{F} = [\begin{matrix} ρ_{δ_{e}} & 0 & 0 \\ 0 & ρ_{δ_{r}} & 0 \\ 0 & 0 & ρ_{δ_{a}} \end{matrix}], Λ_{F} = [\begin{matrix} χ_{δ_{e}} \\ χ_{δ_{r}} \\ χ_{δ_{a}} \end{matrix}]

Different values of Γ_{F} and Λ_{F} correspond to different fault cases. ρ_{j} (j = δ_{e}, δ_{r}, δ_{a}) represents the proportion of the normal actuator output that is actually available, and χ_{j} (j = δ_{e}, δ_{r}, δ_{a}) represents the actuator float variable caused by friction, piezoelectric effects, and so on. χ_{j} is slowly varying or invariant, so it can be regarded as a constant in the controller design.

Expression (3) represents almost all actuator fault models. More specifically:

(1) If Γ_{F} = I and Λ_{F} = 0, the actuator is fault-free;

(2) If 0 < Γ_{F} < I and Λ_{F} = 0, the actuator fault is a loss of effectiveness;

(3) If Γ_{F} = I and Λ_{F} \neq 0, the actuator fault is a drift fault;

(4) If 0 < Γ_{F} < I and Λ_{F} \neq 0, the actuator fault is a combined loss-of-effectiveness and drift fault.
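The four cases above can be checked with a small Python sketch of the fault model (3) for a diagonal Γ_{F}. The function names are illustrative, and the assumption that every ρ_{j} lies in (0, 1] is made here only so that the classification is well defined.

```python
def apply_fault(u, rho, chi):
    """Fault model (3) with Gamma_F = diag(rho), Lambda_F = chi: u^F = Gamma_F u - Lambda_F."""
    return [r * ui - c for ui, r, c in zip(u, rho, chi)]

def classify_fault(rho, chi, tol=1e-12):
    """Map (rho, chi) to one of the four cases listed in the text."""
    if any(not (0.0 < r <= 1.0) for r in rho):
        raise ValueError("each rho_j is assumed to lie in (0, 1]")
    loss = any(r < 1.0 - tol for r in rho)     # some channel has lost effectiveness
    drift = any(abs(c) > tol for c in chi)     # some channel has a bias
    if loss and drift:
        return "combined"
    if loss:
        return "loss of effectiveness"
    if drift:
        return "drift"
    return "fault-free"

# Illustrative command on the three channels (delta_e, delta_r, delta_a).
u_cmd = [1.0, -2.0, 0.5]
u_faulty = apply_fault(u_cmd, rho=[0.6, 0.6, 0.6], chi=[0.1, 0.1, 0.1])
label = classify_fault([0.6, 0.6, 0.6], [0.1, 0.1, 0.1])
```

With 60% effectiveness and a bias of 0.1 on every channel, each command is scaled and then shifted, which is case (4).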

Remark 1.

The faults described by (3) are a general form of HGV actuator faults. In the previous literature, one or more faults were considered, but there was no unified description of HGV actuator faults. This paper gives a general description, as shown in (3), and then presents a controller design method based on this fault model.

2.4. Control Objective

The problem mainly considered in this paper is the design of an attitude-tracking controller for an HGV, so the main objective of the optimal tracking problem is to seek a control policy u (t) that makes system (2) track a desired trajectory r_{d} (t), where

r_{d} (t) = {[\begin{matrix} α_{d}, & β_{d}, & γ_{v d}, & 0, & 0, & 0 \end{matrix}]}^{T}.

Defining the tracking error as

e (t) ≜ x (t) - r_{d} (t),

(4)

then, by (2) and (4), the dynamics of e (t) can be written as:

(Rule i) If α is α_{i}, then

\dot{e} (t) = A_{i} e (t) + A_{i} {\dot{r}}_{d} (t) + B_{i} u^{F} (t),

(5)

where {\dot{r}}_{d} (t) is the derivative of the reference command r_{d} (t). The overall system is then:

\begin{matrix} \dot{e} (t) & = & \sum_{i = 1}^{3} μ_{i} (t) (A_{i} e (t) + B_{i} u^{F} (t) + A_{i} {\dot{r}}_{d} (t)) \\ & = & \bar{A} e (t) + \bar{B} u^{F} (t) + \bar{A} {\dot{r}}_{d} (t) \end{matrix}

(6)

where {\dot{r}}_{d} (t) is assumed to be bounded. Considering the actuator faults in (3), the controller designed for (6) should track the command r_{d} (t) under the unknown faulty actuator output u^{F} (t).

3. Main Results

In this section, a PI-based fuzzy adaptive FTC strategy is proposed that requires only limited information about the actuator faults.

3.1. PI-Based Normal Controller

For the T–S fuzzy model (2) without actuator faults (3), the comprehensive control performance index for the ith rule is

J_{i} (e (t), u_{i} (t), {\dot{r}}_{d} (t)) = \int_{t}^{\infty} [u_{i}^{T} (τ) R u_{i} (τ) + e {(τ)}^{T} Q e (τ) - γ^{2} {\dot{r}}_{d} {(τ)}^{T} {\dot{r}}_{d} (τ)] d τ,

(7)

where R \in \mathbb{R}^{3 \times 3} and Q \in \mathbb{R}^{6 \times 6} are given positive definite symmetric matrices, and γ is a prescribed constant that represents the interference-suppression performance of the designed controller. Define the following value function for the ith rule:

V_{i} (e (t), u_{i} (t), {\dot{r}}_{d} (t)) = \int_{t}^{\infty} [e {(τ)}^{T} Q e (τ) + u_{i} {(τ)}^{T} R_{i} u_{i} (τ) - γ^{2} {\dot{r}}_{d} {(τ)}^{T} {\dot{r}}_{d} (τ)] d τ

(8)

The above value function can be viewed as a zero-sum game between the control policy u_{i} (t) and the derivative of the reference command {\dot{r}}_{d} (t) [28]. In this zero-sum game, the control policy u_{i} (t) seeks to minimize the performance index (7), while {\dot{r}}_{d} (t) seeks to maximize it. According to the definition of a zero-sum game, we make the following definition of the Nash equilibrium.

Definition 1.

Nash equilibrium: The zero-sum game (8) has a unique Nash equilibrium if the following conditions are satisfied:

1. For each fixed control u_{i} (t), there is always a unique {\dot{r}}_{d} (t) that maximizes V_{i} (e (t), u_{i} (t), {\dot{r}}_{d} (t)); that is, there exists a {\dot{r}}_{d}^{*} (t) such that

V_{i} (e (t), u_{i} (t), {\dot{r}}_{d}^{*} (t)) \geq V_{i} (e (t), u_{i} (t), {\dot{r}}_{d} (t));

2. The optimal game solution u_{i}^{*} (t) minimizes V_{i} (e (t), u_{i} (t), {\dot{r}}_{d}^{*} (t)), which means that

V_{i} (e (t), u_{i} (t), {\dot{r}}_{d}^{*} (t)) \geq V_{i} (e (t), u_{i}^{*} (t), {\dot{r}}_{d}^{*} (t));

Here, u_{i}^{*} (t) is the optimal control input, and {\dot{r}}_{d}^{*} (t) is the optimal disturbance policy.

Remark 2.

In the above zero-sum game, we assume that the derivative of the reference command, {\dot{r}}_{d} (t), is a player and can be changed freely. In practice, the reference command must satisfy certain flight requirements, but this assumption makes it easier to obtain the optimal strategy of the game. It also covers the various changes that may occur in actual commands, so the assumption is necessary.

For the above game, according to the Bellman equation, we choose the Hamiltonian function for the ith rule to be

H_{i} (e (t), u_{i} (t), {\dot{r}}_{d} (t)) = e {(t)}^{T} Q e (t) + u_{i} {(t)}^{T} R_{i} u_{i} (t) - γ^{2} {\dot{r}}_{d} {(t)}^{T} {\dot{r}}_{d} (t) + {(\frac{\partial V_{i} (e (t))}{\partial e (t)})}^{T} \dot{e} (t),

where \dot{e} (t) follows (5) with u^{F} (t) = u_{i} (t) in the fault-free case.

For convenience of description, we use the following fuzzy state feedback rule

(Rule i) If α is α_{i}, then

\begin{matrix} u_{i} (t) & = & - K_{i} e (t), \\ {\dot{r}}_{d} (t) & = & L_{i} e (t) \end{matrix}

(9)

to describe the control policy and the derivative policy of the reference command. Then, the game value function for the ith rule can be represented as

V_{i} (e (t), u_{i} (t), {\dot{r}}_{d} (t)) = \int_{t}^{\infty} e {(τ)}^{T} P_{i} e (τ) d τ

where P_{i} is a solution of the following algebraic Riccati equation (ARE) (10):

P_{i} (A_{i} + B_{i} K_{i}) + {(A_{i} + B_{i} K_{i})}^{T} P_{i} + Q + K_{i}^{T} R_{i} K_{i} + γ^{- 2} P_{i} A_{i} A_{i}^{T} P_{i} = 0

(10)

According to the Pontryagin minimum principle, the optimal policies for (9) satisfy

\begin{matrix} \frac{\partial H_{i} (e (t), u_{i}^{*} (t), {\dot{r}}_{d} (t))}{\partial u_{i}^{*} (t)} & = & 0 \\ \frac{\partial H_{i} (e (t), u_{i} (t), {\dot{r}}_{d}^{*} (t))}{\partial {\dot{r}}_{d}^{*} (t)} & = & 0 \end{matrix}

which implies that

\begin{matrix} u_{i}^{*} (t) & = & - R_{i}^{- 1} B_{i}^{T} P_{i}^{*} e (t) = - K_{i} e (t) \end{matrix}

(11)

\begin{matrix} {\dot{r}}_{d}^{*} (t) & = & γ^{- 2} A_{i}^{T} P_{i}^{*} e (t) = L_{i} e (t) \end{matrix}

(12)

where u_{i}^{*} (t) is the optimal control for the ith rule of the T–S model (2), K_{i} = R_{i}^{- 1} B_{i}^{T} P_{i}^{*}, {\dot{r}}_{d}^{*} (t) is the optimal derivative of the reference command, and L_{i} = γ^{- 2} A_{i}^{T} P_{i}^{*}.

With the above optimal policies u_{i}^{*} (t) and {\dot{r}}_{d}^{*} (t), the optimal game value function for the ith rule is

V_{i}^{*} (e (t), u_{i}^{*} (t), {\dot{r}}_{d}^{*} (t)) = \int_{t}^{\infty} e {(τ)}^{T} P_{i}^{*} e (τ) d τ

where P_{i}^{*} is the optimal value of P_{i}. Then, for the T–S model of the HGV, the overall controller is constructed as:

u (t) = - \sum_{j = 1}^{3} μ_{j} (t) K_{j} e (t)

Theorem 1.

For system (2) without actuator faults, if, for the ith rule of the T–S fuzzy system under controller (11) and (12), there exists a positive-definite matrix P_{i} > 0 satisfying

P_{i} (A_{i} + B_{i} K_{i}) + {(A_{i} + B_{i} K_{i})}^{T} P_{i} + Q + K_{i}^{T} R_{i} K_{i} + γ^{- 2} P_{i} A_{i} A_{i}^{T} P_{i} = 0

(13)

then the T–S fuzzy system (2) is stable.

Proof.

The overall performance of (6) is

\begin{matrix} J_{T} & = & \int_{t}^{\infty} e {(τ)}^{T} P e (τ) d τ \\ = & \int_{t}^{\infty} u {(τ)}^{T} R u (τ) + e {(τ)}^{T} Q e (τ) d τ \end{matrix}

(14)

where P is a positive definite symmetric matrix for the overall system. Then,

\begin{matrix} {\dot{J}}_{T} & = & - u {(t)}^{T} R u (t) - e {(t)}^{T} Q e (t) \\ = & e^{T} (t) [\overset{3}{\sum_{i = 1}} μ_{i} (t) \overset{3}{\sum_{j = 1}} μ_{j} (t) P (A_{i} + B_{i} K_{j}) + {(A_{i} + B_{i} K_{j})}^{T} P] e (t) + γ^{2} {\dot{r}}_{d} {(t)}^{T} {\dot{r}}_{d} (t) \\ = & 2 e^{T} (t) [\overset{3}{\sum_{i = 1}} μ_{i} (t) \overset{3}{\sum_{j = 1}} μ_{j} (t) (P A_{i} - P B_{i} R^{- 1} B_{j}^{T} P_{j}) e (t)] + γ^{2} {\dot{r}}_{d} {(t)}^{T} {\dot{r}}_{d} (t) \end{matrix}

Since P and P_{j} are all positive definite symmetric matrices, considering the matrices B_{i} for controller design, there exist positive constants ϱ_{M} \geq ϱ_{m} > 0 such that

ϱ_{m} e^{T} (t) P_{j} B_{j} R^{- 1} B_{j}^{T} P_{j} e (t) \leq e^{T} (t) P B_{i} R^{- 1} B_{j}^{T} P_{j} e (t) \leq ϱ_{M} e^{T} (t) P_{j} B_{j} R^{- 1} B_{j}^{T} P_{j} e (t)

and a positive definite symmetric matrix M_{j} and a positive constant η > 0 such that

P = η P_{j} M_{j} .

Then,

\begin{matrix} {\dot{J}}_{T} & = & 2 e^{T} (t) [\overset{3}{\sum_{j = 1}} μ_{j} (t) (η P_{j} {\bar{A}}_{j} - η P_{j} B_{i} R^{- 1} B_{j}^{T} P_{j}) e (t)] + γ^{2} {\dot{r}}_{d} {(t)}^{T} {\dot{r}}_{d} (t) \\ = & 2 η e^{T} (t) [\overset{3}{\sum_{j = 1}} μ_{j} (t) (P_{j} {\bar{A}}_{j} - P_{j} B_{i} R^{- 1} B_{j}^{T} P_{j}) e (t)] + γ^{2} {\dot{r}}_{d} {(t)}^{T} {\dot{r}}_{d} (t) \\ \leq & 2 η e^{T} (t) [\overset{3}{\sum_{j = 1}} μ_{j} (t) (P_{j} {\bar{A}}_{j} - ϱ_{M} P_{j} B_{j} R^{- 1} B_{j}^{T} P_{j}) e (t)] + γ^{2} {\dot{r}}_{d} {(t)}^{T} {\dot{r}}_{d} (t) \\ \leq & - η \int_{0}^{t} e^{T} (t) (\overset{3}{\sum_{i = 1}} μ_{j} (t) (Q + (2 ϱ_{M} - 1) K_{j}^{T} R K_{j})) e (τ) d τ + γ^{2} {\dot{r}}_{d} {(t)}^{T} {\dot{r}}_{d} (t) \end{matrix}

In this case, {\dot{J}}_{T} < 0 provided that ϱ_{M} > \frac{1}{2} and

∥e (t)∥ > \sqrt{γ^{2} {\dot{r}}_{d} {(t)}^{T} {\dot{r}}_{d} (t) / λ_{\min} (\sum_{j = 1}^{3} μ_{j} (t) (Q + (2 ϱ_{M} - 1) K_{j}^{T} R K_{j}))} .

This completes the proof. □

Based on Theorem 1, the online policy-iteration algorithm is given in Algorithm 1:

Algorithm 1: Model-based PI algorithm

1. Initialization: Set the iteration index i = 0 and choose any reasonable policy u_{i} (t) = K_{i}^{(0)} e (t);

2. Policy evaluation: Solve the following equation for P_{i}^{(i + 1)}:

P_{i}^{(i + 1)} (A_{i} + B_{i} K_{i}^{(i)}) + {(A_{i} + B_{i} K_{i}^{(i)})}^{T} P_{i}^{(i + 1)} + Q + K_{i}^{(i) T} R_{i} K_{i}^{(i)} + γ^{- 2} P_{i} A_{i} A_{i}^{T} P_{i} = 0

3. Policy improvement:

\begin{matrix} K_{i}^{(i + 1)} & = & - R_{i}^{- 1} B_{i}^{T} P_{i}^{(i + 1)} \end{matrix}

(15)

\begin{matrix} L_{i}^{(i + 1)} & = & γ^{- 2} A_{i}^{T} P_{i}^{(i + 1)} \end{matrix}

(16)

4. If the convergence condition is satisfied, stop; else, increase the iteration index by one and go to step 2.

In Algorithm 1, the superscript (i) denotes the ith iteration, whereas the subscript i denotes the ith fuzzy rule.
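The evaluation and improvement steps of Algorithm 1 can be traced on a scalar example. The Python sketch below is only an illustration under stated assumptions: the plant is a hypothetical scalar system with constants a, b, q, r, and γ chosen for the example, and the term γ^{-2} P A A^{T} P is evaluated with the previous iterate of P, which is one possible reading of the policy-evaluation step rather than a detail confirmed by the text.

```python
def policy_iteration(a, b, q, r, gamma, k0=0.0, iters=50):
    """Scalar sketch of Algorithm 1 for e' = a*e + b*u + a*w with u = k*e.

    Evaluation solves 2*p_new*(a + b*k) + q + r*k**2 + (a*p_old/gamma)**2 = 0;
    improvement sets k = -b*p_new/r and l = a*p_new/gamma**2.
    """
    k, p = k0, 0.0
    for _ in range(iters):
        a_cl = a + b * k
        if a_cl >= 0.0:
            raise ValueError("the current policy must be stabilizing")
        # Policy evaluation (disturbance term uses the previous p: an assumption).
        p = (q + r * k * k + (a * p / gamma) ** 2) / (-2.0 * a_cl)
        # Policy improvement, scalar analogue of (15).
        k = -b * p / r
    l = a * p / gamma ** 2  # scalar analogue of (16)
    return p, k, l

# Hypothetical scalar data (assumed values, not HGV parameters).
a, b, q, r, gamma = -1.0, 1.0, 1.0, 1.0, 2.0
p_star, k_star, l_star = policy_iteration(a, b, q, r, gamma)

# Residual of the scalar game Riccati equation obtained by inserting k = -b*p/r.
residual = 2.0 * a * p_star + q - p_star ** 2 * (b * b / r - (a / gamma) ** 2)
```

For these values the iterates settle at the positive root of 0.75 p^2 + 2 p - 1 = 0, so the fixed point of the iteration satisfies the scalar counterpart of (10).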

Remark 3.

In the proof of Theorem 1, the overall performance J_{T} in (14) is not equal to the simple sum of the J_{i} proposed in (7). The proof of Theorem 1 takes the PDC structure of the fuzzy system into account.

3.2. PI-Based Fuzzy FTC Control

As described in (3), Γ_{F} and Λ_{F} are unknown, but they are bounded. An adaptive fuzzy FTC is therefore constructed: for the T–S fuzzy model (6) with actuator faults (3), the fuzzy state feedback FTC controller is:

(Rule i) If α is α_{i}, then

u_{i} (t) = - K_{i} e (t) + {\hat{Γ}}_{F i}^{- 1} {\hat{Λ}}_{F i},

(17)

where {\hat{Λ}}_{F i} is the estimate of Λ_{F} for Rule i, and {\hat{Γ}}_{F i} is the estimate of Γ_{F} for Rule i.
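The role of the compensation term in (17) can be seen from a short numerical check. The Python sketch below assumes a diagonal Γ_{F} and, purely for illustration, estimates that equal the true fault parameters; the vector `Ke` stands for the product K_{i} e (t) and its values are made up for the example.

```python
def ftc_command(Ke, rho_hat, chi_hat):
    """Rule-i command (17): u = -K e + Gamma_hat^{-1} Lambda_hat (diagonal Gamma_hat)."""
    return [-v + c / r for v, r, c in zip(Ke, rho_hat, chi_hat)]

def actuator_output(u, rho, chi):
    """Fault model (3): u^F = Gamma_F u - Lambda_F (diagonal Gamma_F)."""
    return [r * ui - c for ui, r, c in zip(u, rho, chi)]

# True fault parameters (combined fault) and an illustrative feedback term K e.
rho_true = [0.6, 0.6, 0.6]
chi_true = [0.1, 0.1, 0.1]
Ke = [0.2, -0.4, 1.0]

# Ideal case (assumption): the estimates equal the true values.
u_comp = ftc_command(Ke, rho_true, chi_true)
uF_comp = actuator_output(u_comp, rho_true, chi_true)

# Uncompensated case: the controller believes the actuator is healthy.
u_plain = ftc_command(Ke, [1.0, 1.0, 1.0], [0.0, 0.0, 0.0])
uF_plain = actuator_output(u_plain, rho_true, chi_true)
```

With exact estimates the bias cancels and the actuator delivers -Γ_{F} K_{i} e (t), whereas the uncompensated command leaves the constant offset -Λ_{F} in the applied input.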

Theorem 2.

For system (2) with actuator fault (3), if, for the ith rule of the T–S fuzzy system, there exist a positive-definite matrix P_{i} > 0 and an FTC gain K_{i} satisfying

P_{i} (A_{i} + B_{i} K_{i}) + {(A_{i} + B_{i} K_{i})}^{T} P_{i} + Q + K_{i}^{T} R_{i} K_{i} + γ^{- 2} P_{i} A_{i} A_{i}^{T} P_{i} = 0

(18)

where

\begin{matrix} K_{i} & = & - {\hat{Γ}}_{F i}^{- 1} R_{i}^{- 1} B_{i}^{T} P_{i}, \\ u_{i} (t) & = & K_{i} e (t) + {\hat{Γ}}_{F i}^{- 1} {\hat{Λ}}_{F i}, \end{matrix}

and the updating laws for {\hat{Γ}}_{F i} and {\hat{Λ}}_{F i} are

\begin{matrix} {\dot{\hat{Λ}}}_{F i} & = & 2 q_{1} B_{i}^{T} P_{i} e (t) \end{matrix}

(19)

\begin{matrix} {\dot{\hat{Γ}}}_{F i} & = & \mathrm{diag} (2 q_{2} e {(t)}^{T} P_{i} B_{i} u_{i} (t)) \end{matrix}

(20)

where q_{1} and q_{2} are given constants, then system (2) is semi-globally uniformly ultimately bounded.

Proof.

Choose a Lyapunov function for the tracking system (6) as

V^{F} (t) = V_{1} (t) + V_{2} (t) + V_{3} (t),

where

\begin{matrix} V_{1} (t) & = & \int_{0}^{t} e {(τ)}^{T} P e (τ) d τ, \\ V_{2} (t) & = & \frac{1}{2 q_{1}} \mathrm{tr} ({\tilde{Γ}}_{F i}^{T} {\tilde{Γ}}_{F i}), \\ V_{3} (t) & = & \frac{1}{2 q_{2}} {\tilde{Λ}}_{F i}^{T} {\tilde{Λ}}_{F i}, \end{matrix}

and

{\tilde{Γ}}_{F i} = Γ_{F i} - {\hat{Γ}}_{F i}, {\tilde{Λ}}_{F i} = Λ_{F i} - {\hat{Λ}}_{F i} .

Then, consider the derivative of the proposed Lyapunov function:

\begin{matrix} {\dot{V}}_{1} (t) & = & 2 e^{T} (t) \overset{3}{\sum_{i = 1}} μ_{i} (t) \overset{3}{\sum_{j = 1}} μ_{j} (t) [P {\bar{A}}_{i} e (t) + P {\bar{B}}_{i} u_{j}^{F}] e (t) + γ^{2} {\dot{r}}_{d} {(t)}^{T} {\dot{r}}_{d} (t) \\ = & 2 e^{T} (t) \overset{3}{\sum_{i = 1}} μ_{i} (t) \overset{3}{\sum_{j = 1}} μ_{j} (t) [P {\bar{A}}_{i} e (t) + P {\bar{B}}_{i} (Γ_{F_{j}} u_{j} (t) - Λ_{F j})] + γ^{2} {\dot{r}}_{d} {(t)}^{T} {\dot{r}}_{d} (t) \\ = & 2 e^{T} (t) \overset{3}{\sum_{i = 1}} μ_{i} (t) \overset{3}{\sum_{j = 1}} μ_{j} (t) [P {\bar{A}}_{i} e (t) + P {\bar{B}}_{i} ((Γ_{F i} - {\hat{Γ}}_{F i}) u_{j} (t) + {\hat{Γ}}_{F i} u_{j} (t) - Λ_{F j})] \\ + γ^{2} {\dot{r}}_{d} {(t)}^{T} {\dot{r}}_{d} (t) \\ = & 2 e^{T} (t) [\overset{3}{\sum_{i = 1}} μ_{i} (t) \overset{3}{\sum_{j = 1}} μ_{j} (t) (P {\bar{A}}_{i} - P {\bar{B}}_{i} R^{- 1} B_{j}^{T} P_{j})] e (t) \\ + 2 e^{T} (t) \overset{3}{\sum_{i = 1}} μ_{i} (t) \overset{3}{\sum_{j = 1}} μ_{j} (t) [P {\bar{B}}_{i} ({\tilde{Γ}}_{F i} u_{j} (t) - {\tilde{Λ}}_{F i})] \\ + γ^{2} {\dot{r}}_{d} {(t)}^{T} {\dot{r}}_{d} (t) \end{matrix}

and considering the updating laws of {\hat{Γ}}_{F i} and {\hat{Λ}}_{F i},

{\dot{V}}_{2} (t) = - \mathrm{tr} ({\tilde{Γ}}_{F i}^{T} {\hat{Γ}}_{F i}),

{\dot{V}}_{3} (t) = - {\tilde{Λ}}_{F i} {\hat{Λ}}_{F i},

and

\begin{matrix} e^{T} (t) \overset{3}{\sum_{i = 1}} μ_{i} (t) \overset{3}{\sum_{j = 1}} μ_{j} (t) [P {\bar{B}}_{i} ({\tilde{Γ}}_{F i} u_{j} (t) - {\tilde{Λ}}_{F i})] \\ = & e^{T} (t) P \overset{3}{\sum_{i = 1}} μ_{i} (t) \overset{3}{\sum_{j = 1}} μ_{j} (t) [{\bar{B}}_{i} ({\tilde{Γ}}_{F i} u_{j} (t) - {\tilde{Λ}}_{F i})] \end{matrix}

then,

\begin{matrix} {\dot{V}}_{1} (t) + {\dot{V}}_{2} (t) + {\dot{V}}_{3} (t) \\ \leq & - η \int_{0}^{t} e^{T} (t) (\overset{3}{\sum_{i = 1}} μ_{j} (t) (Q + (2 ϱ_{M} - 1) K_{j}^{T} R K_{j})) e (τ) d τ + γ^{2} {\dot{r}}_{d} {(t)}^{T} {\dot{r}}_{d} (t) \\ + 2 ϱ_{M}^{'} e^{T} (t) P_{j} B_{j} {\tilde{Γ}}_{F i} u_{j} (t) - t r ({\tilde{Γ}}_{F i}^{T} {\hat{Γ}}_{F i}) - 2 ϱ_{m}^{'} e^{T} (t) P_{j} B_{j} {\tilde{Λ}}_{F i} - {\tilde{Λ}}_{F i} {\hat{Λ}}_{F i} \\ \leq & - η \int_{0}^{t} e^{T} (t) (\overset{3}{\sum_{i = 1}} μ_{j} (t) (Q + (1 - 2 (1 - η)) K_{j}^{T} R K_{j})) e (τ) d τ + γ^{2} {\dot{r}}_{d} {(t)}^{T} {\dot{r}}_{d} (t) \end{matrix}

Based on the analysis of Theorem 1, the whole system is semi-globally uniformly ultimately bounded. □

3.3. IRL-Based Fuzzy FTC Control

Solving the conditions of Theorem 2 requires full knowledge of A_{i} and B_{i}, and the ARE is difficult to solve. To reduce the dependence on the system model in the solution process and to give a method that is easier to solve, a new data-based PI fuzzy FTC controller is developed from Theorem 2 and Algorithm 1, in which the system matrices A_{i} can be unknown and only B_{i} is used by the proposed FTC [25].

The derivative of the ith value function is

\begin{matrix} {\dot{V}}_{i} (e (t), u_{i} (t), {\dot{r}}_{d} (t)) & = & e^{T} (t) P_{i} \dot{e} (t) \\ = & - e {(t)}^{T} Q e (t) - u_{i} {(t)}^{T} R_{i} u_{i} (t) + γ^{2} {\dot{r}}_{d} {(t)}^{T} {\dot{r}}_{d} (t) \end{matrix}

(21)

Integrating both sides of (21) from t to t + T results in

e {(t + T)}^{T} P_{i}^{(i + 1)} e (t + T) - e {(t)}^{T} P_{i}^{(i + 1)} e (t) = - \int_{t}^{t + T} [e {(τ)}^{T} (Q + K_{i}^{(i) T} R_{i} K_{i}^{(i)}) e (τ) - γ^{2} {\dot{r}}_{d} {(τ)}^{T} {\dot{r}}_{d} (τ)] d τ .

Since the values of e (t + T) and e (t) and the integral on the right-hand side can all be computed, the positive definite matrices P_{i} can be solved from the above equation.

Then, the following algorithm is proposed (Algorithm 2):

Algorithm 2: IRL-based fuzzy FTC control algorithm

1. Initialization: Set the iteration index i = 0, and choose any reasonable policy u_{i} (t) = K_{i}^{(0)} e (t) and P^{(i)} > 0;

2. Policy evaluation: Solve the following equation for P_{i}^{(i + 1)}:

e {(t + T)}^{T} P_{i}^{(i + 1)} e (t + T) - e {(t)}^{T} P_{i}^{(i + 1)} e (t) = - \int_{t}^{t + T} [e {(τ)}^{T} (Q + K_{i}^{(i) T} R_{i} K_{i}^{(i)}) e (τ) - γ^{2} {\dot{r}}_{d} {(τ)}^{T} {\dot{r}}_{d} (τ)] d τ .

3. Policy improvement:

K_{i}^{(i + 1)} = - {\hat{Γ}}_{F i}^{- 1} R_{i}^{- 1} B_{i}^{T} P_{i}^{(i + 1)},

and

\begin{matrix} {\dot{\hat{Λ}}}_{F i} & = & 2 q_{1} B_{i}^{T} P_{i} e (t) \\ {\dot{\hat{Γ}}}_{F i} & = & \mathrm{diag} (2 q_{2} e {(t)}^{T} P_{i} B_{i} u_{i} (t)) \end{matrix}

4. If the convergence condition is satisfied, stop; else, increase the iteration index by one and go to step 2.
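The data-based policy-evaluation step (step 2) can be sketched for a scalar system in Python. The example below is an illustration under simplifying assumptions that are not taken from the paper: no actuator fault, {\dot{r}}_{d} = 0, a hypothetical scalar plant, and error samples generated by a Runge-Kutta simulation that stands in for measured flight data. The estimator itself uses only the samples, the step size, and the known weights and gain, never the plant constant a.

```python
def simulate_window(e0, a_cl, T, h=1e-3):
    """Stand-in for measured data: RK4 samples of e' = a_cl * e on [0, T]."""
    samples, e = [e0], e0
    for _ in range(round(T / h)):
        k1 = a_cl * e
        k2 = a_cl * (e + 0.5 * h * k1)
        k3 = a_cl * (e + 0.5 * h * k2)
        k4 = a_cl * (e + h * k3)
        e += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        samples.append(e)
    return samples

def irl_policy_evaluation(samples, h, q, r, k):
    """Solve p*e(t+T)^2 - p*e(t)^2 = -integral of (q + r*k^2)*e^2 for scalar p."""
    cost = [(q + r * k * k) * e * e for e in samples]
    integral = h * (sum(cost) - 0.5 * (cost[0] + cost[-1]))  # trapezoidal rule
    return integral / (samples[0] ** 2 - samples[-1] ** 2)

# Hypothetical scalar plant and current gain (assumed values).
a, b, q, r, k = -1.0, 1.0, 1.0, 1.0, -0.5
h, T = 1e-3, 0.5

data = simulate_window(e0=1.0, a_cl=a + b * k, T=T, h=h)
p_data = irl_policy_evaluation(data, h, q, r, k)

# Model-based value for comparison: 2*p*(a + b*k) + q + r*k^2 = 0.
p_model = (q + r * k * k) / (-2.0 * (a + b * k))
```

The two values agree up to the quadrature error of the trapezoidal rule, which is why the evaluation step can dispense with A_{i} when trajectory data are available.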

A flowchart of the proposed AFFTC is given in Figure 1. As Figure 1 shows, the desired trajectory r_{d} (t) is first applied to system (1), and the tracking error e (t) is generated. The adaptive parameter estimator then updates the parameter estimates according to (19) and (20), and the controller gain is updated according to (18). Finally, the overall update of the fuzzy controller is completed.

Remark 4.

The proposed FTC method is an IRL-based adaptive fuzzy FTC controller. Although adaptive control methods can adjust the parameters of a controller, the efficiency of self-adaptation adjustment is too low for time-varying faults or sudden faults. Reinforcement learning can quickly obtain new control parameters according to the changes in HGV states. Combining reinforcement learning with self-adaptation, the proposed method can quickly adjust the parameters of the controller when the HGV has sudden or time-varying faults.

4. Simulation Results

According to the T–S approximation method proposed in Section 2, the fuzzy model of the HGV’s attitude system can be constructed as follows:

(Rule 1) If α is α_{S}, then

\dot{x} (t) = A_{1} x (t) + B_{1} u (t),

(Rule 2) If α is α_{M}, then

\dot{x} (t) = A_{2} x (t) + B_{2} u (t),

(Rule 3) If α is α_{B}, then

\dot{x} (t) = A_{3} x (t) + B_{3} u (t),

Then, based on the adaptive fuzzy FTC strategy given in Section 3.2, the FTC controller can be derived for different faults. The parameters and initial states of the simulation are similar to those in ref. [5]. To test the effectiveness of the proposed controller, three simulations under three different fault modes are given as follows:

Case I: Loss of effectiveness. In this simulation, the actuator faults are

ρ_{F} = [\begin{matrix} 0.6 & 0 & 0 \\ 0 & 0.6 & 0 \\ 0 & 0 & 0.6 \end{matrix}], β_{F} = [\begin{matrix} 0 \\ 0 \\ 0 \end{matrix}] .

Following the proposed method, the loss-of-effectiveness fault is applied to the nonlinear attitude model of the HGV. The fault is set to occur 40 s after the simulation starts. For comparison, the T–S controller proposed in ref. [29], denoted $u_{T-S}$, is also applied to the attitude control of the HGV, and the proposed adaptive fuzzy controller is denoted $u_{FTC}$. The tracking results for the given command are presented in Figure 2, and the control input is presented in Figure 3. The figures show that, after the fault occurs, the tracking error under $u_{FTC}$ is smaller than that under $u_{T-S}$, and the input under $u_{FTC}$ is also smoother than that under $u_{T-S}$. The proposed adaptive fuzzy FTC can therefore adjust itself to the fault and obtain better performance.
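The comparison above is read off the figures. One possible way to put numbers on "smaller tracking error" and "smoother input" is sketched below, using the post-fault RMS tracking error and the total variation of the input. These metrics, the function name `post_fault_metrics`, and the sampled-array inputs are assumptions introduced for this example; they are not part of the reported simulations.

```python
import numpy as np


def post_fault_metrics(t, e, u, t_fault=40.0):
    """Summarize tracking error and input smoothness after the fault time.

    t: sample times, shape (N,)
    e: tracking error samples, shape (N,) or (N, m)
    u: control input samples, shape (N,) or (N, m)
    Returns the RMS error and the total variation of the input per channel.
    """
    t = np.asarray(t, dtype=float)
    e = np.asarray(e, dtype=float)
    u = np.asarray(u, dtype=float)
    mask = t >= t_fault  # keep only samples taken after the fault occurs
    rms_error = np.sqrt(np.mean(e[mask] ** 2, axis=0))
    # Total variation: sum of absolute sample-to-sample changes; a smaller
    # value indicates a smoother input signal.
    total_variation = np.sum(np.abs(np.diff(u[mask], axis=0)), axis=0)
    return {"rms_error": rms_error, "total_variation": total_variation}
```

Evaluating the function once on the logged signals of each controller would give two pairs of numbers that can be compared directly.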

Case II: Drift fault. In this simulation, the following faults are considered:

ρ_{F} = [\begin{matrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{matrix}], β_{F} = [\begin{matrix} 0.1 \\ 0.1 \\ 0.1 \end{matrix}] .

The fault is set to occur 40 s after the simulation starts. The tracking results for the given command are presented in Figure 4, and the control input is presented in Figure 5. The figures show that, under the drift fault, the proposed adaptive fuzzy FTC can also guarantee the tracking performance of the HGV, and the input under $u_{FTC}$ is smoother than that under $u_{T-S}$.

Case III: Combined fault. In this simulation, the following faults are considered:

ρ_{F} = [\begin{matrix} 0.6 & 0 & 0 \\ 0 & 0.6 & 0 \\ 0 & 0 & 0.6 \end{matrix}], β_{F} = [\begin{matrix} 0.1 \\ 0.1 \\ 0.1 \end{matrix}] .

The faults are set to occur 40 s after the simulation starts. The tracking results for the given command are presented in Figure 6, and the input to the HGV is presented in Figure 7. Under the combined fault, the inputs under $u_{FTC}$ and $u_{T-S}$ are both non-smooth, but the oscillation amplitude of $u_{FTC}$ is much smaller than that of $u_{T-S}$.
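The three fault cases can be summarized in a short sketch. It assumes the common multiplicative-plus-additive fault form $u_{F}(t) = ρ_{F} u(t) + β_{F}$ that becomes active at 40 s; the exact fault model is the one given in Section 2.3, and the names `FAULT_CASES`, `T_FAULT` and `faulty_input` are hypothetical.

```python
import numpy as np

T_FAULT = 40.0  # fault onset time in seconds, as stated for all three cases

# (rho_F, beta_F) pairs taken from Cases I-III above.
FAULT_CASES = {
    "I": (0.6 * np.eye(3), np.zeros(3)),         # loss of effectiveness
    "II": (np.eye(3), 0.1 * np.ones(3)),         # drift fault
    "III": (0.6 * np.eye(3), 0.1 * np.ones(3)),  # combined fault
}


def faulty_input(u, t, case):
    """Return the input that reaches the vehicle at time t.

    Assumed fault form: u_F = rho_F @ u + beta_F once t >= T_FAULT,
    and the commanded input unchanged before that.
    """
    u = np.asarray(u, dtype=float)
    if t < T_FAULT:
        return u
    rho, beta = FAULT_CASES[case]
    return rho @ u + beta
```

Under this assumed form, a commanded input of 1 on every channel becomes 0.6 in Case I, 1.1 in Case II and 0.7 in Case III once the fault is active.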

5. Conclusions

FTC of an HGV’s attitude control system is discussed in this paper. Actuator faults are considered and are described by a general actuator fault model. Based on the T–S fuzzy approach, the nonlinear attitude model is first represented by a T–S fuzzy model, and a normal T–S controller that does not consider the actuator fault is then designed using RL. Based on this normal fuzzy controller, an improved adaptive FTC is designed that can be adjusted online according to the failure mode. Finally, simulation results for different kinds of actuator faults are given to show the performance of the proposed FTC.

The simulation results given in this paper are based on a mathematical model. Future work will focus on the engineering implementation of the proposed algorithm and on verifying the method with an actual HGV.

Author Contributions

Conceptualization, C.H. and M.L.; methodology, X.H.; software, M.L.; validation, C.H., H.L. and M.L.; investigation, X.H.; resources, C.H.; data curation, H.P. and H.L.; writing—original draft preparation, M.L.; writing—review and editing, H.P. and C.H.; visualization, X.H.; supervision, X.H.; project administration, H.L.; funding acquisition, C.H. and X.H. All authors have read and agreed to the published version of the manuscript.

Funding

This work was partially supported by the National Natural Science Foundation of China under Grant 62073265.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Shao, X.; Wang, H. Active disturbance rejection based trajectory linearization control for hypersonic reentry vehicle with bounded uncertainties. ISA Trans. 2015, 54, 27–38.
2. Zhao, S.; Wang, J.; Xu, H.; Wang, B. Composite observer-based optimal attitude-tracking control with reinforcement learning for hypersonic vehicles. IEEE Trans. Cybern. 2022, 53, 913–926.
3. Wang, Y.; Chen, M.; Wu, Q.; Zhang, J. Fuzzy adaptive non-affine attitude tracking control for a generic hypersonic flight vehicle. Aerosp. Sci. Technol. 2018, 80, 56–66.
4. Huang, Y.; Sun, C.; Qian, C.; Wang, L. Linear parameter varying switching attitude control for a near space hypersonic vehicle with parametric uncertainties. Int. J. Syst. Sci. 2015, 46, 3019–3031.
5. Yu, X.; Li, P.; Zhang, Y. Fixed-time actuator fault accommodation applied to hypersonic gliding vehicles. IEEE Trans. Autom. Sci. Eng. 2020, 18, 1429–1440.
6. Chao, D.; Qi, R.; Jiang, B. Adaptive fault-tolerant attitude control for hypersonic reentry vehicle subject to complex uncertainties. J. Frankl. Inst. 2022, 359, 5458–5487.
7. Lv, J.; Wang, C.; Kao, Y. Adaptive fixed-time quantized fault-tolerant attitude control for hypersonic reentry vehicle. Neurocomputing 2023, 520, 386–399.
8. Wang, L.; Meng, Y.; Hu, S.; Peng, Z.; Shi, W. Adaptive fault-tolerant attitude tracking control for hypersonic vehicle with unknown inertial matrix and states constraints. IET Control Theory Appl. 2023, 17, 1397–1412.
9. Hu, Q.; Wang, C.; Li, Y.; Huang, J. Adaptive control for hypersonic vehicles with time-varying faults. IEEE Trans. Aerosp. Electron. Syst. 2018, 54, 1442–1455.
10. Li, P.; Yu, X.; Zhang, Y.; Peng, X. Adaptive multivariable integral TSMC of a hypersonic gliding vehicle with actuator faults and model uncertainties. IEEE/ASME Trans. Mechatron. 2017, 22, 2723–2735.
11. Li, A.; Liu, S.; Hu, X. Fault-tolerant Attitude Control for Hypersonic Flight Vehicle Subject to Actuators Constrain: A Model Predictive Static Programming Approach. IEEE J. Miniaturization Air Space Syst. 2023, 4, 136–145.
12. Wang, L.; Liu, F.; Yu, J.; Li, P.; Zhang, R.; Gao, F. Iterative learning fault-tolerant control for injection molding processes against actuator faults. J. Process Control 2017, 59, 59–72.
13. Poongodi, T.; Saravanakumar, T.; Mishra, P.P.; Zhu, Q. Extended Dissipative Control for Markovian Jump Time-Delayed Systems with Bounded Disturbances. Math. Probl. Eng. 2020, 2020, 5685324.
14. Wang, L.; Li, H.; Li, H.; Zhang, R.; Gao, F. Constrained model predictive fault-tolerant control for nonlinear batch processes with time delay by integrating a LRF method and a switching strategy. Chem. Eng. Sci. 2024, 287, 119762.
15. Qiu, J.; Ji, W.; Chadli, M. A novel fuzzy output feedback dynamic sliding mode controller design for two-dimensional nonlinear systems. IEEE Trans. Fuzzy Syst. 2020, 29, 2869–2877.
16. Zeng, Y.; Lam, H.K.; Xiao, B.; Wu, L. Tracking Control for Nonlinear Systems with Actuator Saturation via Interval Type-2 TS Fuzzy Framework. IEEE Trans. Cybern. 2022, 53, 7085–7094.
17. Ren, B.; Karimi, H.R.; Yin, T.; Fu, S. Asynchronous H∞ filtering for semi-Markov jump TS fuzzy systems within partial state delay and deception attack: Applied to aircraft-pilot state estimation. J. Frankl. Inst. 2022, 360, 9265–9289.
18. Wang, J.; Hu, L.; Chen, F.; Wen, C. Multiple-step fault estimation for interval type-II TS fuzzy system of hypersonic vehicle with time-varying elevator faults. Int. J. Adv. Robot. Syst. 2017, 14, 1729881417699149.
19. Li, Y.; Gao, W.; Huang, S.; Wang, R.; Yan, W.; Gevorgian, V.; Gao, D.W. Data-driven optimal control strategy for virtual synchronous generator via deep reinforcement learning approach. J. Mod. Power Syst. Clean Energy 2021, 9, 919–929.
20. Wei, Q.; Li, H.; Yang, X.; He, H. Continuous-Time Distributed Policy Iteration for Multicontroller Nonlinear Systems. IEEE Trans. Cybern. 2021, 51, 2372–2383.
21. Luo, B.; Liu, D.; Huang, T.; Liu, J. Output Tracking Control Based on Adaptive Dynamic Programming with Multistep Policy Evaluation. IEEE Trans. Syst. Man Cybern. Syst. 2019, 49, 2155–2165.
22. Li, H.; Wu, Y.; Chen, M. Adaptive Fault-Tolerant Tracking Control for Discrete-Time Multiagent Systems via Reinforcement Learning Algorithm. IEEE Trans. Cybern. 2021, 51, 1163–1174.
23. Zhang, L.; Zhang, R.; Wu, T.; Weng, R.; Han, M.; Zhao, Y. Safe Reinforcement Learning With Stability Guarantee for Motion Planning of Autonomous Vehicles. IEEE Trans. Neural Netw. Learn. Syst. 2021, 32, 5435–5444.
24. Zhong, W.; Wang, M.; Wei, Q.; Lu, J. A new neuro-optimal nonlinear tracking control method via integral reinforcement learning with applications to nuclear systems. Neurocomputing 2022, 483, 361–369.
25. Wang, G.; Luo, B.; Xue, S. Integral reinforcement learning-based optimal output feedback control for linear continuous-time systems with input delay. Neurocomputing 2021, 460, 31–38.
26. Chen, Q.; Jin, Y.; Song, Y. Fault-tolerant adaptive tracking control of Euler-Lagrange systems—An echo state network approach driven by reinforcement learning. Neurocomputing 2022, 484, 109–116.
27. Wang, L.; Li, X.; Zhang, R.; Gao, F. Reinforcement Learning-Based Optimal Fault-Tolerant Tracking Control of Industrial Processes. Ind. Eng. Chem. Res. 2023, 62, 16014–16024.
28. Wang, L.; Jia, L.; Zhang, R.; Gao, F. H∞ output feedback fault-tolerant control of industrial processes based on zero-sum games and off-policy Q-learning. Comput. Chem. Eng. 2023, 179, 108421.
29. Hu, X.; Wu, L.; Hu, C.; Gao, H. Fuzzy guaranteed cost tracking control for a flexible air-breathing hypersonic vehicle. IET Control Theory Appl. 2012, 6, 1238–1249.

Figure 1. Flowchart of the proposed method.

Figure 2. Tracking performance: Case I.

Figure 3. Input of HGV: Case I.

Figure 4. Tracking performance: Case II.

Figure 5. Input of HGV: Case II.

Figure 6. Tracking performance: Case III.

Figure 7. Input of HGV: Case III.

Table 1. Fuzzy Rules.

Rule No.	Premise Variable $α$
1	S
2	M
3	B

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
