RRL is a relational reinforcement learning system based on Q-learning in relational state-action spaces. It aims to enable agents to learn how to act in an environment that has no natural representation as a tuple of constants. For relational reinforcement learning, the learning algorithm used to approximate the mapping between state-action pairs and their so-called Q(uality)-value has to be very reliable, and it has to be able to handle the relational representation of state-action pairs. In this paper we investigate the use of Gaussian processes to approximate the Q-values of state-action pairs. In order to employ Gaussian processes in a relational setting we propose graph kernels as a covariance function between state-action pairs. The standard prediction mechanism for Gaussian processes requires a matrix inversion which can become unstable when the kernel matrix has low rank. These instabilities can be avoided by employing QR-factorization. This leads to better and more stable performance of the algorithm and a more efficient incremental update mechanism. Experiments conducted in the blocks world and with the Tetris game show that Gaussian processes with graph kernels can compete with, and often improve on, regression trees and instance based regression as a generalization algorithm for RRL. Editors: David Page and Akihiro Yamamoto

In this paper we introduce and illustrate non-trivial upper and lower bounds on the learning curves for one-dimensional Gaussian Processes. The analysis is carried out emphasising the effects induced on the bounds by the smoothness of the random process described by the Modified Bessel and the Squared Exponential covariance functions. We present an explanation of the early, linearly-decreasing behavior of the learning curves and the bounds as well as a study of the asymptotic behavior of the curves. The effects of the noise level and the lengthscale on the tightness of the bounds are also discussed.

The least-squares policy iteration approach works efficiently in value function approximation, given appropriate basis functions. Because of its smoothness, the Gaussian kernel is a popular and useful choice as a basis function. However, it does not allow for discontinuity which typically arises in real-world reinforcement learning tasks. In this paper, we propose a new basis function based on geodesic Gaussian kernels, which exploits the non-linear manifold structure induced by the Markov decision processes. The usefulness of the proposed method is successfully demonstrated in simulated robot arm control and Khepera robot navigation.
Sethu Vijayakumar
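
The contrast between an ordinary Gaussian kernel and a geodesic one can be illustrated with a small sketch. It assumes a grid-world MDP in which the geodesic distance between two states is the shortest-path length through free cells, computed here by breadth-first search. The grid, the function names, and the value of sigma are hypothetical choices for illustration and are not taken from the paper.

```python
import math
from collections import deque

# Hypothetical 3x5 grid world: '.' is a free cell, '#' is a wall.
GRID = [
    "..#..",
    "..#..",
    ".....",
]

def shortest_path_lengths(grid, start):
    """Breadth-first search: number of steps from start to every reachable free cell."""
    rows, cols = len(grid), len(grid[0])
    dist = {start: 0}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == "." and (nr, nc) not in dist:
                dist[(nr, nc)] = dist[(r, c)] + 1
                queue.append((nr, nc))
    return dist

def geodesic_gaussian_kernel(grid, center, sigma):
    """Gaussian of the shortest-path distance from center, for each reachable cell."""
    dist = shortest_path_lengths(grid, center)
    return {s: math.exp(-d * d / (2.0 * sigma ** 2)) for s, d in dist.items()}

def euclidean_gaussian_kernel(a, b, sigma):
    """Ordinary Gaussian kernel on the straight-line distance between two cells."""
    d2 = (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
    return math.exp(-d2 / (2.0 * sigma ** 2))

center, other = (0, 1), (0, 3)   # two cells on opposite sides of the wall
geo = geodesic_gaussian_kernel(GRID, center, sigma=2.0)
euc = euclidean_gaussian_kernel(center, other, sigma=2.0)
print(geo[other], euc)
```

In this toy grid the two cells are close in straight-line terms but far apart along any feasible path, so the geodesic kernel assigns them a much smaller similarity. This is the kind of discontinuity across obstacles that the ordinary Gaussian kernel smooths over.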

Yang Wenhao, Li Xiaoman 《计算机应用》 (Journal of Computer Applications) 2016,36(5):1383-1386
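To address the problem that the single Gaussian background model cannot adapt to non-stationary scenes and produces "ghosting" for objects that are initially static and later start moving, a single Gaussian background modeling method that fuses sub-block gradients with linear prediction is proposed. First, a single Gaussian background model is built for every pixel and updated adaptively at the pixel level, and a sub-block gradient algorithm treats sub-blocks whose gradient lies within a threshold as background in order to eliminate ghosting. Then, the foreground obtained by the sub-block gradient method is combined by a logical AND with the foreground determined by the single Gaussian model, which improves the ability to identify background in non-stationary scenes. Finally, linear prediction is applied to the resulting foreground points, and connected regions whose area is below a threshold are restored to background. Simulation experiments were carried out on the CDNET 2012 Dataset and the Wallflower Dataset. When the scene changes substantially, the proposed algorithm has a slightly lower detection rate than the Gaussian mixture model (GMM) but improves detection precision by 40%; in the other scenes the detection rate improves by only about 10%, while detection precision improves by more than 25%. The experimental results show that single Gaussian background modeling fused with sub-block gradients and linear prediction adapts to non-stationary scenes and eliminates ghosting, yields a more accurate background than the Gaussian mixture model, and extracts richer foreground detail.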
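
The pixel-level step of such a method can be sketched roughly as follows. This is a minimal sketch of a generic running single-Gaussian background model, not the authors' implementation: the function name, the learning rate `alpha`, the threshold multiplier `k`, the variance floor `min_var`, and the choice to update only pixels classified as background are all illustrative assumptions. The sub-block gradient and linear prediction stages are not reproduced.

```python
def update_single_gaussian(mean, var, frame, alpha=0.05, k=2.5, min_var=4.0):
    """Classify each pixel against its Gaussian, then update background pixels in place.

    mean, var, frame are lists of rows of grey levels; returns a foreground mask.
    """
    mask = []
    for r, row in enumerate(frame):
        mask_row = []
        for c, x in enumerate(row):
            diff = x - mean[r][c]
            # Foreground if the pixel lies more than k standard deviations from the mean.
            is_foreground = diff * diff > (k * k) * var[r][c]
            mask_row.append(is_foreground)
            if not is_foreground:
                # Exponential running update of the per-pixel mean and variance.
                mean[r][c] += alpha * diff
                var[r][c] = max(min_var, (1 - alpha) * var[r][c] + alpha * diff * diff)
        mask.append(mask_row)
    return mask

# Hypothetical 2x2 example: background near grey level 100, one bright object pixel.
mean = [[100.0, 100.0], [100.0, 100.0]]
var = [[25.0, 25.0], [25.0, 25.0]]
frame = [[102, 100], [180, 99]]
mask = update_single_gaussian(mean, var, frame)
print(mask)
```

Updating only background pixels keeps a moving object from polluting the model, but it is also one reason an object that was static during initialisation can leave a ghost once it moves, which is the failure mode the abstract targets.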

The probability hypothesis density (PHD) filter provides an efficiently parallel processing method for multi-target tracking. However, measurements have to be gathered for a scan period before the PHD filter can perform a recursion; therefore, significant delay may arise if the scan period is long. To reduce the delay in the PHD filter, we propose a sequential PHD filter which updates the posterior intensity whenever a new measurement becomes available. An implementation of the sequential PHD filter for a linear Gaussian system is also developed. The unique characteristic of the proposed filter is that it can retain the useful information of missed targets in the posterior intensity and sequentially handle the received measurements in time.

The joint segmentation of multiple series is considered. A mixed linear model is used to account for both covariates and correlations between signals. An estimation algorithm based on EM which involves a new dynamic programming strategy for the segmentation step is proposed. The computational efficiency of this procedure is shown and its performance is assessed through simulation experiments. Applications are presented in the field of climatic data analysis.


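Within the random finite set framework, an error bound for multi-target joint detection and estimation (JDE) in the presence of clutter and missed detections is proposed, based on the optimal subpattern assignment (OSPA) distance. Here JDE means estimating the number of targets and the states of the surviving targets simultaneously. Example 1 shows how the error bound varies with the sensor detection probability and the clutter density; Example 2 verifies the effectiveness of the bound using multiple hypothesis tracking, the probability hypothesis density (PHD) filter and the cardinalized PHD filter.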
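
For readers unfamiliar with the OSPA distance that this bound is built on, the following is a minimal sketch of the standard OSPA metric of order `p` with cut-off `c` between two finite sets of points. It finds the optimal assignment by brute force over permutations, so it is only suitable for very small sets. The function name and default parameters are illustrative assumptions, and the sketch computes the distance itself, not the error bound derived in the paper.

```python
import math
from itertools import permutations

def ospa(X, Y, c=10.0, p=2):
    """OSPA distance between two lists of points (tuples) with cut-off c and order p."""
    if len(X) > len(Y):
        X, Y = Y, X                      # ensure X is the smaller set
    m, n = len(X), len(Y)
    if n == 0:
        return 0.0                       # both sets empty
    # Best assignment of the m points of X to distinct points of Y (brute force).
    best = min(
        sum(min(c, math.dist(x, Y[j])) ** p for x, j in zip(X, perm))
        for perm in permutations(range(n), m)
    )
    # Each of the n - m unassigned points incurs the maximum penalty c.
    return ((best + (c ** p) * (n - m)) / n) ** (1.0 / p)

print(ospa([(0.0, 0.0)], [(3.0, 4.0)]))               # localisation error only
print(ospa([(0.0, 0.0)], [(0.0, 0.0), (50.0, 50.0)]))  # cardinality error only
```

The cut-off `c` trades off localisation error against cardinality error: a missed or false target costs exactly `c`, so a joint detection and estimation error is measured on a single scale.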


This article is concerned with the approximation of the distributional behaviour of linear, time-invariant (LTI) systems. First, we review the different types of approximations of distributions by smooth functions and explain their significance in characterising system properties. Second, we consider the problem of changing the state of controllable LTI differential systems in a very short time. Thus, we establish an interesting relation between the time and volatility parameters of the Gaussian function and its derivatives in the approximation of distributional solutions. An algorithm is then proposed for calculating the distributional input and its smooth approximation which minimises the distance to an arbitrary target state. The optimal choice of the volatility parameter for the state transition is also derived. Finally, some complementary distance problems are also considered. The main results of this article are illustrated by numerous examples.

A novel robust integral linear quadratic Gaussian (ILQG) controller is presented in this paper to control the voltage of an islanded microgrid and improve its transient response. A microgrid is a small grid that consists of a number of distributed generator units, power-electronic components with inductor-capacitor (LC) filters, and loads. The loads are parametrically uncertain and unknown, which produces voltage or power oscillations. The ILQG controller is capable of compensating for the voltage oscillation and tracks the grid voltage under different load dynamics. The design of the ILQG controller is carried out by augmenting the plant dynamics with an integrator. The robustness of the ILQG controller is studied by considering a number of uncertainties within the plant model. The performance of the ILQG controller is compared with that of linear quadratic regulator (LQR) and linear quadratic Gaussian (LQG) controllers in terms of rise time, settling time, bandwidth and tracking error. The comparison results confirm the high bandwidth and tracking performance of the ILQG controller compared with the other controllers.

Image enhancement using a human visual system model   Total citations: 2, self-citations: 0, citations by others: 2
In this paper we report the result of a set of computer experiments carried out to enhance digital images. We use a special line weight function (LWF) which is a combination of zero- and second-order Hermite functions. We are motivated by the physiological evidence reported in R. A. Young, Spatial Vision 2(4), 273–293 (1987), that visual receptive fields are shaped like the sum of a Gaussian function and its Laplacian. This function can also be derived mathematically when the contrast sensitivity experiments in psychophysics are posed as an eigenvalue problem (A. L. Stewart and R. Pinkham, Biol. Cybernetics 64, 373–379 (1991)). Analyses of the edge location error show that the proposed function has extremely good localization capability (i.e. the points marked by the operator are as close as possible to the center of the true edge). We also show that the LWF does not detect phantom edges which do not correspond to significant image intensity changes.

In this paper, we propose a model order reduction (MOR) method based on general orthogonal polynomials for K-power bilinear systems in the time domain. Constructing proper projection matrices by solving a series of linear equations, a reduced K-power bilinear system is produced, which preserves the original coupled structure. It can match several expansion coefficients of the original output. Then the error bound of our algorithm is also investigated. Moreover, the stability of the reduced system is discussed as well. Finally, two numerical examples are provided to illustrate the effectiveness of our algorithm.

对复Gauss ian小波满足Mercy条件及其在Hilbert空间具有再生性的命题作了证明.用复Gaussian小波构建出一种核函数,与主成分分析方法相结合,对非线性非平稳信号进行参数辨识和预测.针对多参数模型优化时间过长,不利于工程应用的问题,提出了一种多参数同步优化策略.仿真实验验证了该方法的可行性和有效性,表明该方法具有较好的实用价值.  相似文献   

A novel approach to characterise the model prediction errors using a Gaussian mixture model is proposed. The motivation for this work lies behind many data models that are developed through prediction error minimisation with the assumption of a normal noise distribution. When the noise is non-normal, which may often be the case in complicated data modelling scenarios, the model prediction errors may contain rich information, which can be further exploited for model refinement and improvement. The key contents presented in this paper include: choosing the relevant variables to form the error data, optimising the number of Gaussian components required for the error data modelling, and fitting the Gaussian mixture parameters using an expectation-maximisation algorithm. Application of the proposed method for further model improvement, within the framework of hybrid deterministic/stochastic modelling, is also discussed. Preliminary results on the real industrial Charpy impact energy data for heat-treated steels show its effectiveness for model error characterisation, and the potential for model performance improvement in terms of prediction accuracy as well as providing accurate prediction confidence intervals.

This paper presents fractional-order Kalman filters using the Tustin generating function for linear and nonlinear fractional-order systems involving process noise and measurement noise. By using the Tustin generating function, a difference equation model is obtained by discretising the investigated continuous-time fractional-order system. Two kinds of fractional-order Kalman filters are given for the linear fractional-order system, one for the case of correlated process and measurement noise and one for the uncorrelated case. In addition, based on the first-order Taylor expansion formula, the extended fractional-order Kalman filter using the Tustin generating function is proposed to improve the accuracy of state estimation. Finally, three examples are given to verify the effectiveness of the Tustin fractional-order Kalman filters for linear and nonlinear fractional-order systems.

Feature enhancement is an important preprocessing step in many image processing tasks. It is the process of adjusting image intensities so that the enhanced results are more suitable for analysis. Good enhancement results for linear structures such as vessels or neurites can be used as inputs for segmentation and other operations. In this paper, a novel linear feature enhancement filter – an adaptive multi-scale morpho-Gaussian filter – which can enhance and smooth linear features is proposed based on morphological operation, anisotropic Gaussian function and Hessian information. This filter can enhance and smooth along the local orientation of the linear structures and the Hessian measurement is used to further enhance the linear features. We utilize the Hessian matrix to calculate the orientation information for our directional morphological operation and the oriented anisotropic Gaussian smoothing. We also propose a novel method for junction enhancement, which can solve the problem of junction suppression. We decompose the junctions and enhance along each linear structure within a junction region. We present the test results of our algorithm on images of different types and compare our method with three existing methods. The experimental results show that the proposed approach can achieve better results.

《Journal of Process Control》2014,24(11):1647-1659
The problem of controlling a high-dimensional linear system subject to hard input and state constraints using model predictive control is considered. Applying model predictive control to high-dimensional systems typically leads to a prohibitive computational complexity. Therefore, reduced order models are employed in many applications. This introduces an approximation error which may deteriorate the closed loop behavior and may even lead to instability. We propose a novel model predictive control scheme using a reduced order model for prediction in combination with an error bounding system. We employ the explicit time and input dependent bound on the model order reduction error to achieve design conditions for constraint fulfillment, recursive feasibility and asymptotic stability for the closed loop of the model predictive controller when applied to the high-dimensional system. Moreover, for a special choice of design parameters, we establish local optimality of the proposed model predictive control scheme. The proposed MPC approach is assessed via examples demonstrating that a good trade-off between computational efficiency and conservatism can be achieved while guaranteeing constraint satisfaction and asymptotic stability.

It is undoubtedly important to be able to ensure the existence of a common quadratic Lyapunov function (CQLF) for a given switched system because this is proof of its asymptotic stability, but equally important is the ability to calculate it in order to obtain more specific information about the behaviour of the switched system under analysis. This article describes the development of a new methodology for calculating a CQLF based on particle swarm optimisation (PSO) once the existence of a CQLF has been assured. Several comparative analyses are presented to show the strengths and advantages of the proposed methodology.

Quality function deployment (QFD) is a product development process performed to maximize customer satisfaction. In the QFD, the design requirements (DRs) affecting the product performance are primarily identified, and product performance is improved to optimize customer needs (CNs). For product development, determining the fulfillment levels of design requirements (DRs) is crucial during QFD optimization. However, in real world applications, the values of DRs are often discrete instead of continuous. To the best of our knowledge, there is no mixed integer linear programming (MILP) model in which the discrete DRs values are considered. Therefore, in this paper, a new QFD optimization approach combining MILP model and Kano model is suggested to acquire the optimized solution from a limited number of alternative DRs, the values of which can be discrete. The proposed model can be used not only to optimize the product development but also in other applications of QFD such as quality management, planning, design, engineering and decision-making, on the condition that DR values are discrete. Additionally, the problem of lack of solutions in integer and linear programming in the QFD optimization is overcome. Finally, the model is illustrated through an example.

A Probabilistic Framework for SVM Regression and Error Bar Estimation   Total citations: 9, self-citations: 0, citations by others: 9
In this paper, we elaborate on the well-known relationship between Gaussian Processes (GP) and Support Vector Machines (SVM) under some convex assumptions for the loss functions. This paper concentrates on the derivation of the evidence and error bar approximation for regression problems. An error bar formula is derived based on the ε-insensitive loss function.
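
As a reminder of the loss this entry refers to, here is a minimal sketch of the standard ε-insensitive loss used in support vector regression: residuals smaller than ε cost nothing, and larger residuals are penalised linearly. The function name and the default value of `epsilon` are illustrative assumptions. The sketch shows the loss only, not the evidence or error bar derivation.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Return max(0, |y_true - y_pred| - epsilon)."""
    return max(0.0, abs(y_true - y_pred) - epsilon)

inside_tube = epsilon_insensitive_loss(1.0, 1.05)    # residual 0.05 is within the tube
outside_tube = epsilon_insensitive_loss(1.0, 1.5)    # residual 0.5 exceeds the tube by 0.4
print(inside_tube, outside_tube)
```

The flat region of width 2ε around the target is what makes SVM regression solutions sparse, and it is also why a probabilistic reading of the loss needs care when deriving error bars.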

In this paper, a numerical method based on quintic B-splines has been developed to solve systems of linear and nonlinear Fredholm and Volterra integral equations. The solutions are collocated by quintic B-splines and then the integral equations are approximated by the four-point Gauss-Turán quadrature formula with respect to the Legendre weight function. The quintic spline leads to optimal approximation, and O(h^6) global error estimates are obtained for the numerical solution. The error analysis of the proposed numerical method is studied theoretically. Comparison with the results obtained by other methods shows that our method is accurate.
