Found 20 similar articles (search time: 31 ms)
1.
Reinforcement learning is one of the fastest-growing areas in machine learning and has achieved notable success in
biomedicine, the Internet of Things (IoT), logistics, robotic control, etc. However, many challenges remain for engineering
applications, such as how to speed up the learning process and how to balance the trade-off between exploration and exploitation.
Quantum technology, which can solve complex problems faster than classical methods, especially in supercomputers,
offers a new paradigm for overcoming these challenges in reinforcement learning. In this paper, a quantum-enhanced
reinforcement learning algorithm is presented for optimal control. In this algorithm, the states and actions of reinforcement learning
are quantized by quantum technology. Then, a probability amplification method, which can effectively avoid the
trade-off between exploration and exploitation via quantized technology, is presented. Finally, the optimal control policy is
learned during the reinforcement learning process. The performance of this quantized algorithm is demonstrated in both
the MountainCar and CartPole reinforcement learning environments, two classical
control environments in OpenAI Gym. The preliminary results indicate that, compared with
Q-learning, this quantized reinforcement learning method achieves better control performance without having to consider the trade-off
between exploration and exploitation. The learning performance of the new algorithm is stable across learning rates
from 0.01 to 0.10, which suggests it is promising for systems with unknown dynamics.
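For orientation, the classical baseline named in this abstract, tabular Q-learning with epsilon-greedy action selection, can be sketched as follows. This is a minimal illustration of the standard update rule, not the quantum-enhanced algorithm of the paper; the function names, the tiny two-state table, and the parameter values are assumptions chosen for the example.

```python
import random


def epsilon_greedy(q_row, epsilon, rng=random):
    """Pick a random action with probability epsilon, otherwise a greedy one.

    epsilon is the knob that trades exploration against exploitation.
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return q_row.index(max(q_row))


def q_update(q, state, action, reward, next_state, lr=0.1, gamma=0.99):
    """One tabular Q-learning step: move Q(s, a) toward the TD target."""
    target = reward + gamma * max(q[next_state])
    q[state][action] += lr * (target - q[state][action])
    return q[state][action]


# Hypothetical 2-state, 2-action table used only to exercise the update.
q = [[0.0, 0.0], [1.0, 2.0]]
new_value = q_update(q, state=0, action=1, reward=1.0, next_state=1,
                     lr=0.1, gamma=0.5)
greedy_action = epsilon_greedy([0.0, 3.0, 1.0], epsilon=0.0)
```

The learning rate `lr` here plays the role of the 0.01 to 0.10 range mentioned in the abstract; `epsilon` is the parameter a probability-amplification scheme would aim to make unnecessary.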
2.
Given a collection of parameterized multi-robot controllers associated with individual behaviors designed for particular
tasks, this paper considers the problem of how to sequence and instantiate the behaviors for the purpose of completing a
more complex, overarching mission. In addition, uncertainties about the environment or even the mission specifications
may require the robots to learn, in a cooperative manner, how best to sequence the behaviors. In this paper, we approach this
problem by using reinforcement learning to approximate the solution to the computationally intractable sequencing problem,
combined with an online gradient descent approach to selecting the individual behavior parameters, while the transitions
among behaviors are triggered automatically when the behaviors have reached a desired performance level relative to a task
performance cost. To illustrate the effectiveness of the proposed method, it is implemented on a team of differential-drive
robots for solving two different missions, namely, convoy protection and object manipulation.
3.
This study concentrates on solving the output consensus problem for a class of heterogeneous uncertain nonstrict-feedback
nonlinear multi-agent systems under switching-directed communication topologies, in which all followers are subjected to
multi-type input constraints such as unknown asymmetric saturation, unknown dead-zone and their integration. A unified
representation is presented to overcome the difficulties originating from multi-agent input constraints. Moreover, the uncertain
system functions in a non-lower triangular form and the interaction terms among agents are dealt with by exploiting
the fuzzy logic systems and their special property. Furthermore, by introducing a nonlinear filter to alleviate the problem of
“explosion of complexity” during the backstepping design, a distributed common adaptive control protocol is proposed to
ensure that the synchronization errors converge to a small neighborhood of the origin despite the existence of multiple input
constraints and arbitrary switching communication topologies. Stability analysis and simulation results are provided
to show the effectiveness and performance of the proposed control methodology.
4.
Zachary Freudenburg, Khaterah Kohneshin, Erik Aarnoutse, Mariska Vansteensel, Mariana Branco, Sacha Leinders, Max van den Boom, Elmar G. M. Pels, Nick Ramsey 《控制理论与应用(英文版)》2021,19(4):444-454
While brain computer interfaces (BCIs) offer the potential of allowing those suffering from loss of muscle control to once
again fully engage with their environment by bypassing the affected motor system and decoding user intentions directly from
brain activity, they are prone to errors. One possible avenue for BCI performance improvement is to detect when the BCI user
perceives the BCI to have made an unintended action and thus take corrective actions. Error-related potentials (ErrPs) are
neural correlates of error awareness and as such can provide an indication of when a BCI system is not performing according
to the user’s intentions. Here, we investigate the brain signals of an implanted BCI user suffering from locked-in syndrome
(LIS) due to late-stage ALS that prevents her from being able to speak or move but not from using her BCI at home on a
daily basis to communicate, for the presence of error-related signals. We first establish the presence of an ErrP originating
from the dorsolateral pre-frontal cortex (dLPFC) in response to errors made during a discrete feedback task that mimics the
click-based spelling software she uses to communicate. Then, we show that this ErrP can also be elicited by cursor movement
errors in a continuous BCI cursor control task. This work represents a first step toward detecting ErrPs during the daily
home use of a communications BCI.
5.
An extended state observer (ESO)-based loop filter (LF) is designed for the phase-locked loop (PLL) involved in a disturbed grid-connected converter (GcC). This ESO-based design enhances the performance and robustness of the PLL and, therefore, improves the control performance of disturbed GcCs. Besides, the ESO-based LF can be applied to PLLs with extra filters for abnormal grid conditions. The unbalanced grid is particularly taken into account in the performance analysis. A tuning approach based on a well-designed PI controller is discussed, which results in a fair comparison with conventional PI-type PLLs. The frequency-domain properties are quantitatively analysed with respect to control stability and noise rejection. The frequency-domain analysis and simulation results suggest that the performance of the resulting ESO-based controllers is comparable to that of PI control at low frequency, while they attenuate high-frequency measurement noise better. The phase margin decreases slightly but remains acceptable. Finally, experimental tests are conducted with a hybrid power hardware-in-the-loop benchmark, in which both balanced and unbalanced cases are explored. The obtained results demonstrate the effectiveness of ESO-based PLLs when applied to the disturbed GcC.
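To make the ESO idea concrete, here is a minimal sketch of a generic second-order linear extended state observer for a first-order plant, discretized with forward Euler. It is not the loop filter designed in the paper: the plant model `dy/dt = d + b*u`, the bandwidth parameterization `beta1 = 2*omega`, `beta2 = omega**2`, and all numerical values are assumptions made for illustration.

```python
def simulate_eso(omega=20.0, b=1.0, d=2.0, dt=0.001, steps=2000):
    """Run a linear ESO against the assumed plant dy/dt = d + b*u.

    z1 estimates the output y; z2 estimates the lumped ("total")
    disturbance, which for this toy plant is the constant d.
    """
    y = 0.0              # true plant output
    z1, z2 = 0.0, 0.0    # observer states
    beta1, beta2 = 2.0 * omega, omega ** 2  # observer gains (poles at -omega)
    u = 0.0              # control input held at zero in this sketch
    for _ in range(steps):
        e = y - z1                            # output estimation error
        z1 += dt * (z2 + b * u + beta1 * e)   # output estimate
        z2 += dt * (beta2 * e)                # extended (disturbance) state
        y += dt * (d + b * u)                 # plant step
    return y, z1, z2


y_final, z1_final, z2_final = simulate_eso()
```

With these assumed values the extended state `z2` settles at the constant disturbance, which is the quantity a controller would then cancel. A larger `omega` speeds up convergence but passes more measurement noise, the trade-off the abstract analyses in the frequency domain.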
6.
Model predictive control (MPC) is an optimal control method that predicts the future states of the system being controlled and
estimates the optimal control inputs that drive the predicted states to the required reference. The computations of the MPC
are performed at pre-determined sample instances over a finite time horizon. The number of sample instances and the horizon
length determine the performance of the MPC and its computational cost. A long horizon with a large sample count allows
the MPC to better estimate the inputs when the states have rapid changes over time, which results in better performance but
at the expense of high computational cost. However, this long horizon is not always necessary, especially for slowly-varying
states. In this case, a short horizon with a smaller sample count is preferable, as the same MPC performance can be obtained at a
fraction of the computational cost. In this paper, we propose an adaptive regression-based MPC that predicts the best minimum
horizon length and the sample count from several features extracted from the time-varying changes of the states. The proposed
technique builds a synthetic dataset using the system model and utilizes the dataset to train a support vector regressor that
performs the prediction. The proposed technique is experimentally compared with several state-of-the-art techniques on both
linear and non-linear models. The proposed technique shows a superior reduction in computational time with a reduction of
about 35–65% compared with the other techniques without introducing a noticeable loss in performance.
7.
A pneumatic actuator is a fast and economical tool that converts compressed air into mechanical motion. In this paper, an extended state observer (ESO)-based sliding mode controller (SMC) is developed to adjust the air pressure of the actuator for accurate position control. Specifically, an impedance control module is established to produce the desired air pressure based on the relationship between forces and desired positions. Then, the ESO-based SMC is implemented to adjust the air pressure to the required level despite the presence of system uncertainties and disturbances. As a result, the position of the actuator is controlled to a setpoint through the regulation of pressure. The performance of the ESO-based SMC is compared with that of a classic active disturbance rejection controller (ADRC) and an SMC. Simulation results demonstrate that the ESO-based SMC shows comparable performance to ADRC in terms of precise pressure control. In addition, it requires the least control effort necessary to excite valves among the three controllers. The stability of the ESO-based SMC is theoretically justified through a Lyapunov approach.
8.
This paper deals with the dynamic output feedback stabilization problem of deterministic finite automata (DFA). The static
form of this problem is defined and solved in previous studies via a set of equivalent conditions. In this paper, the dynamic
output feedback (DOF) stabilization of DFAs is defined in which the controller is supposed to be another DFA. The DFA
controller will be designed to stabilize the equilibrium point of the main DFA through a set of proposed equivalent conditions.
It has been proven that the design problem of DOF stabilization is more feasible than the static output feedback (SOF)
stabilization. Three simulation examples are provided to illustrate the results of this paper in more detail. The first example
considers an instance DFA and develops SOF and DOF controllers for it. The example explains the concepts of the DOF
controller and how it will be implemented in the closed-loop DFA. In the second example, a special DFA is provided in
which the DOF stabilization is feasible, whereas the SOF stabilization is not. The final example compares the feasibility
performance of the SOF and DOF stabilizations by applying them to one hundred randomly generated DFAs. The results
reveal the superiority of the DOF stabilization.
9.
In this paper, a data-driven method for disturbance estimation and rejection is presented. The proposed approach is divided into two stages: an inner stabilization loop, to set the desired reference model, together with an outer loop for disturbance estimation and compensation. Inspired by the active disturbance rejection control framework, the exogenous and endogenous disturbances are lumped into a total disturbance signal. This signal is estimated using an on-line algorithm based on a data-driven predictor scheme, whose parameters are chosen to satisfy high robustness-performance criteria. The above process is presented as a novel enhancement to the design of a disturbance observer, which constitutes the main contribution of the paper. In addition, the control strategy is presented entirely in discrete time, avoiding the use of discretization methods for its digital implementation. As a case study, the voltage control of a DC-DC synchronous buck converter affected by disturbances in the input voltage and the load is considered. Finally, experimental results that validate the proposed strategy and some comparisons with the classical disturbance observer-based control are presented.
10.
In recent years, cyber attacks have posed great challenges to the development of cyber-physical systems. It is of great
significance to study secure state estimation methods to ensure the safe and stable operation of the system. This paper
proposes a secure state estimation for multi-input and multi-output continuous-time linear cyber-physical systems with sparse
actuator and sensor attacks. First, for sparse sensor attacks, we propose an adaptive switching mechanism to mitigate the
impact of sparse sensor attacks by filtering out their attack modes. Second, an unknown input sliding mode observer is
designed to not only observe the system states, sensor attack signals, and measurement noise present in the system but also
counteract the effects of sparse actuator attacks through an unknown input matrix. Finally, for the design of an unknown
input sliding mode state observer, the feasibility of the observing system is demonstrated by means of Lyapunov functions.
Additionally, simulation experiments are conducted to show the effectiveness of this method.
11.
Behzad Farzanegan, Mohsen Zamani, Amir Abolfazl Suratgar, Mohammad Bagher Menhaj 《控制理论与应用(英文版)》2021,19(2):283-294
In this study, an adaptive neuro-observer-based optimal control (ANOPC) policy is introduced for unknown nonaffine
nonlinear systems with control input constraints. The Hamilton–Jacobi–Bellman (HJB) framework is employed to minimize a
non-quadratic cost function corresponding to the constrained control input. ANOPC consists of both analytical and algebraic
parts. In the analytical part, first, an observer-based neural network (NN) approximates uncertain system dynamics,
and then another NN structure solves the HJB equation. In the algebraic part, the optimal control input that does not exceed
the saturation bounds is generated. The weights of two NNs associated with observer and controller are simultaneously
updated in an online manner. The uniform ultimate boundedness (UUB) of all signals of the whole closed-loop system
is ensured through Lyapunov’s direct method. Finally, two numerical examples are provided to confirm the effectiveness of
the proposed control strategy.
12.
We examine three simple linear systems from the viewpoint of ergodic theory. We digitize the output and record only the
sign of the output at integer times. We show that even with this minimal output we can recover important information about
the systems. In particular, for a two-dimensional system viewed as a flow on the circle, we can determine the rate of rotation.
We then use these results to determine the slope of the trajectories for constant irrational flow on the two-dimensional
torus. To achieve this, we randomize the system by partitioning the state space and only recording which partition the state
is in at each integer time. We show directly that these systems have entropy zero. Finally, we examine two four-dimensional
systems and reduce them to the study of linear flows on the two-dimensional torus.
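As a rough illustration of how much information survives in sign-only output, the sketch below estimates the rotation rate of a point moving on the circle from the frequency of sign changes observed at integer times. It is not necessarily the construction used in the paper. It assumes the rate `alpha` (in turns per unit time) lies in (0, 1/2), so that the long-run fraction of sign changes equals `2 * alpha`; the function name, initial phase, and sample count are hypothetical.

```python
import math


def estimate_rotation(alpha, x0=0.1, n_samples=20000):
    """Estimate alpha from the signs of cos(2*pi*(x0 + n*alpha)), n = 0..N.

    Only the sign of the output is recorded at each integer time. For
    0 < alpha < 1/2 the fraction of steps on which the sign flips tends
    to 2*alpha, so alpha is recovered as (number of flips) / (2*N).
    """
    signs = []
    for n in range(n_samples + 1):
        value = math.cos(2.0 * math.pi * (x0 + n * alpha))
        signs.append(1 if value >= 0.0 else -1)
    flips = sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    return flips / (2.0 * n_samples)


alpha_true = math.sqrt(2.0) - 1.0   # an irrational rate below 1/2
alpha_hat = estimate_rotation(alpha_true)
```

Sign data alone cannot distinguish `alpha` from `1 - alpha` (the direction of rotation) or from rates that differ by whole turns, so the estimate is meaningful only under the stated range assumption.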
13.
Liangcheng Cai 《控制理论与应用(英文版)》2021,19(2):227-235
In this paper, the asymptotic stability of Port-Hamiltonian (PH) systems with constant inputs is studied. Constant inputs
are useful for stabilizing systems at their nonzero equilibria and can be realized by step signals. To achieve this goal, two
methods based on integral action and comparison principle are presented in this paper. These methods change the convex
Hamiltonian function and the restricted damping matrix of the previous results into a Hamiltonian function with a local
minimum and a positive semidefinite matrix, respectively. Due to common conditions of Hamiltonian function and damping
matrix, the proposed method asymptotically stabilizes more classes of PH systems with constant inputs than the existing
methods. Finally, the validity and advantages of the presented methods are shown in an example.
14.
This paper discusses the problem of global state regulation via output feedback for a class of feedforward nonlinear time-delay
systems with unknown measurement sensitivity. Different from previous works, the nonlinear terms are dominated by upper
triangular linear unmeasured (delayed) states multiplied by unknown growth rate. The unknown growth rate is composed of
an unknown constant, a power function of output, and an input function. Furthermore, due to the measurement uncertainty
of the system output, it is more difficult to solve this problem. It is proved that the presented output feedback controller can
globally regulate all states of the nonlinear systems using the dynamic gain scaling technique and choosing the appropriate
Lyapunov–Krasovskii functionals.
15.
Actuator faults usually cause security problems in practice. This paper is concerned with the security control of positive
semi-Markovian jump systems with actuator faults. The considered systems have mode transition-dependent sojourn-time
distributions, which may also lead to actuator faults. First, the time-varying and bounded transition rate that satisfies
the mode transition-dependent sojourn-time distribution is considered. Then, a stochastic co-positive Lyapunov function is
constructed. Using matrix decomposition technique, a set of state-feedback controllers for positive semi-Markovian jump
systems with actuator faults is designed in terms of linear programming. Under the designed controllers, stochastic stabilization
of the systems with actuator faults is achieved and the security of the systems can be guaranteed. Furthermore, the
proposed results are extended to positive semi-Markovian jump systems with interval and polytopic uncertainties. By virtue
of a segmentation technique of the transition rates, a less conservative security control design is also proposed. Finally,
numerical examples are provided to demonstrate the validity of the presented results.
16.
In this paper, we present the development of a navigation control system for a sailboat based on spiking neural networks
(SNNs). Our inspiration for this choice of network lies in their potential to achieve fast and low-energy computing on specialized
hardware. To train our system, we use the modulated spike time-dependent plasticity reinforcement learning rule and a
simulation environment based on the BindsNET library and the USVSim simulator. Our objective is to develop a spiking
neural network-based control system that can learn policies allowing sailboats to navigate between two points by following
a straight line or performing tacking and gybing strategies, depending on the sailing scenario conditions. We present the
mathematical definition of the problem, the operation scheme of the simulation environment, the spiking neural network
controllers, and the control strategy used. As a result, we obtained 425 SNN-based controllers that completed the proposed
navigation task, indicating that the simulation environment and the implemented control strategy work effectively. Finally,
we compare the behavior of our best controller with other algorithms and present some possible strategies to improve its
performance.
17.
This paper presents an in-depth analytical and empirical assessment of the performance of DoubleBee, a novel hybrid aerial–
ground robot. Particularly, the dynamic model of the robot with ground contact is analyzed, and the unknown parameters in
the model are identified. We apply an unscented Kalman filter-based approach and a least square-based approach to estimate
the parameters with given measurements and inputs at every time step. Real data are collected and used to estimate the
parameters; test data verify that the values obtained are able to model the rotation of the robot accurately. A gain-scheduled
feedback controller is proposed, which leverages the identified model to generate accurate control inputs to drive the system
to the desired states. The system is proven to track a constant-velocity reference signal with bounded error. Simulations and
real-world experiments using the proposed controller show improved performance over the PID-based controller in tracking
step commands and maintaining attitude under robot movement.
18.
This paper considers distributed state estimation of a continuous-time linear system monitored by a network of multiple sensors. Each sensor can access only a local partial measurement output of the system and communicates with its neighbors to cooperatively achieve asymptotic estimation of the full state of the target system. For a constructive design, we incorporate the concept of system immersions and propose a class of distributed tracking observers for the problem under a reasonable condition of locally joint observability. Moreover, as a direct application of the proposed observer design, we further present an interesting leader-following consensus design for multi-agent systems.
19.
This paper addresses state estimation for a class of nonlinear time-varying stochastic systems with both uncertain dynamics and unknown measurement bias. A novel extended state based Kalman filter (ESKF) algorithm is developed to estimate the original state, the uncertain dynamics and the measurement bias. It is shown that the estimation error of the proposed algorithm is bounded in the mean square sense. Also, the estimate of the measurement bias asymptotically converges to its true value, such that the influence of the measurement bias is eliminated. Furthermore, the asymptotic optimality of the estimation result is proved when the uncertain dynamics approaches a constant vector. Finally, a simulation study of a harmonic oscillator system model is provided to illustrate the effectiveness of the proposed method.
20.
Rotating stall and surge are two violent unstable phenomena of an aero-engine compressor. The early detection of rotating stall is a critical and difficult issue in the operation of a compressor. Recently, a deterministic learning based stall inception detection approach (SIDA) has been developed for modeling and detecting stall inception in aero-engine compressors. This paper considers the derivation of analytical results on the detection capabilities for the SIDA based on deterministic learning. First, by utilizing the input/output stability of the residual system, a detectability condition of the SIDA is presented, and how to choose the parameters of the diagnostic system is also analyzed. Second, based on the relationship between NN approximation capabilities and radial basis function (RBF) network structures, the influence of RBF network structures on the performance properties of the SIDA is analyzed. Finally, a simulation study is presented, in which the Mansoux-C2 compressor model is utilized to verify the effectiveness of the proposed SIDA.