Similar literature
 20 similar documents found (search time: 46 ms)
1.
In many applications, such as compliant and accurate robot tracking control, dynamics models learned from data can help to achieve both compliant control performance and high tracking quality. Online learning of these dynamics models allows the robot controller to adapt to changes in the dynamics (e.g., due to time-variant nonlinearities or unforeseen loads). However, online learning in real-time applications, as required in control, cannot be realized by straightforward use of off-the-shelf machine learning methods such as Gaussian process regression or support vector regression. In this paper, we propose a framework for online, incremental sparsification with a fixed budget, designed for fast real-time model learning. The proposed approach employs a sparsification method based on an independence measure. In combination with an incremental learning approach such as incremental Gaussian process regression, we obtain a model approximation method that is applicable to real-time online learning. It exhibits competitive learning accuracy when compared with standard regression techniques. Implementation on a real Barrett WAM robot demonstrates the applicability of the approach to real-time online model learning for real-world systems.
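The abstract above does not spell out its independence measure, so the following is only a minimal sketch of one common choice, the kernel linear-independence test, in which a new point is kept only if it is poorly represented by the points already stored. The RBF kernel, the function names (`rbf`, `solve`, `independence`, `maybe_add`) and the threshold are all assumptions made for illustration; the fixed-budget deletion step of the paper is not reproduced here.

```python
import math


def rbf(x, y, width=1.0):
    """Gaussian (RBF) kernel; the kernel choice is an assumption."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-d2 / (2.0 * width ** 2))


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][j] * z[j] for j in range(r + 1, n))
        z[r] = (M[r][n] - tail) / M[r][r]
    return z


def independence(dictionary, x):
    """delta = k(x, x) - k^T K^{-1} k; near 0 means x is already well represented."""
    if not dictionary:
        return rbf(x, x)
    K = [[rbf(a, b) for b in dictionary] for a in dictionary]
    k = [rbf(a, x) for a in dictionary]
    coeffs = solve(K, k)
    return rbf(x, x) - sum(c * ki for c, ki in zip(coeffs, k))


def maybe_add(dictionary, x, threshold=0.1):
    """Append x to the dictionary only if it is sufficiently independent (hypothetical threshold)."""
    if independence(dictionary, x) > threshold:
        dictionary.append(x)
        return True
    return False
```

Calling `maybe_add` on each incoming sample keeps the stored set small because near-duplicates of stored points are rejected. A fixed budget would additionally require a rule for evicting a stored point once the dictionary is full, which this sketch leaves out.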

2.
In this paper, we propose a method for modeling trajectory patterns with both regional and velocity observations through a probabilistic topic model. By embedding Gaussian models into the discrete topic model framework, our method, unlike existing approaches, uses continuous velocity observations as well as regional observations. In addition, the proposed framework, combined with a hidden Markov model, can cover the temporal transitions of the scene state, which is useful for checking violations of the rule that conflicting topics (e.g., two cross-traffic patterns) should not occur at the same time. To achieve online learning despite the complexity of the proposed model, we suggest a novel learning scheme in place of collapsed Gibbs sampling. The proposed two-stage greedy learning scheme is not only efficient at reducing the search space but also accurate, in that the accuracy of online learning is no worse than that of batch learning. To validate the performance of our method, experiments were conducted on various datasets. Experimental results show that our model satisfactorily explains the trajectory patterns with respect to scene understanding, anomaly detection, and prediction.

3.
Real-time strategy (RTS) games provide a challenging platform for implementing online reinforcement learning (RL) techniques in a real application. The computer, as one of the players, monitors the strategies of its opponents (humans or other computers) and then updates its own policy using RL methods. In this article, we first examine the suitability of applying online RL in various computer games. The applicability of reinforcement learning depends on both the RL complexity and the features of the game. We then propose a multi-layer framework for implementing online RL in an RTS game. The framework significantly reduces the computational complexity of RL by decomposing the state space hierarchically. We implement an RTS game, Tank General, and perform a thorough test of the proposed framework. We consider three typical profiles of RTS game players and compare two basic RL techniques applied in the game. The results show the effectiveness of our proposed framework and shed light on relevant issues in using online RL in RTS games.

4.
Friend recommendation plays a key role in improving the user experience in online social networks (OSNs). However, existing studies usually neglect users’ fine-grained interests and the way those interests evolve, which may lead to unsuitable recommendations. In particular, for some OSNs, such as online learning communities, there has been little work on friend recommendation at all. To this end, we strive in this paper to improve friend recommendation using fine-grained, evolving interests. We take the online learning community as an application scenario; it is a special type of OSN in which people take courses online. Learning partners can help improve learners’ learning outcomes and make platforms more attractive. We propose a learning partner recommendation framework based on the evolution of fine-grained learning interest (LPRF-E for short). We extract a sequence of learning interest tags that changes over time. Then, we exploit the time feature to predict the evolving learning interest. Next, we recommend learning partners by fine-grained interest similarity. We also refine the learning partner recommendation framework with users’ social influence (denoted LPRF-F to distinguish it). Extensive experiments on two real datasets crawled from Chinese University MOOC and Douban Book validate that the proposed LPRF-E and LPRF-F models achieve high accuracy (i.e., approximately 50% improvement in precision and recall) and can recommend high-quality learning partners (e.g., more experienced and helpful ones).

5.
An algorithm using an unsupervised Bayesian online learning process is proposed for the segmentation of object-based video images. The video segmentation problem is solved using a classification method. First, different visual features (spatial location, colour and optical-flow vectors) are fused in a probabilistic framework for image pixel clustering. The probability distribution function (PDF) of each feature-cluster is modelled by a Gaussian distribution. Each image pixel is then assigned a cluster number in a maximum a posteriori probability framework. Unlike previous segmentation methods, the unsupervised Bayesian online learning algorithm learns a cluster's PDF parameters over the image sequence. This online learning process uses the pixels of the previously clustered image and information from the feature-cluster to update the PDF parameters for segmentation of the current image. The unsupervised Bayesian online learning algorithm has shown satisfactory experimental results on different video sequences.

6.
We propose a model-based learning algorithm, the Adaptive-resolution Reinforcement Learning (ARL) algorithm, that aims to solve the online, continuous state space reinforcement learning problem in a deterministic domain. Our goal is to combine adaptive-resolution approximation schemes with efficient exploration in order to obtain polynomial learning rates. The proposed algorithm builds an adaptive approximation of the optimal value function using kernel-based averaging, going from a coarse to a fine kernel-based representation of the state space, which enables us to use finer resolution in the “important” areas of the state space and coarser resolution elsewhere. We consider an online learning approach, in which we discover these important areas online using an uncertainty intervals exploration technique. In addition, we introduce an incremental variant of ARL (IARL), which is a more practical version of the original algorithm with reduced computational complexity at each stage. Polynomial learning rates in terms of mistake bound (in a PAC framework) are established for these algorithms under appropriate continuity assumptions.

7.
Existing Takagi-Sugeno-Kang (TSK) fuzzy models proposed in the literature attempt to optimize the global learning accuracy while maintaining the interpretability of the local models. Most of the proposed methods rely on offline learning algorithms to globally optimize this multi-criteria problem. Although they can reach an optimal solution in terms of accuracy and interpretability, these offline methods are not well suited to learning in adaptive or incremental systems. Furthermore, most learning methods for TSK models are susceptible to the curse of dimensionality. This paper studies three criteria in the design of TSK models: 1) the interpretability of the local model; 2) the global accuracy; and 3) the system dimensionality issues. A generic framework is proposed to handle the different scenarios in this design problem. The framework is termed the generic fuzzy input Takagi-Sugeno-Kang fuzzy framework (FITSK). The FITSK framework extends to both zero-order and first-order FITSK models. A zero-order FITSK model is suitable for learning in adaptive systems, and the bias-variance of the system can be easily controlled through the degree of localization. A first-order FITSK model, on the other hand, achieves higher learning accuracy for nonlinear system estimation. A localized version of the recursive least-squares algorithm is proposed for tuning the parameters of the first-order FITSK model. The local recursive least-squares algorithm achieves a balance between the interpretability and the learning accuracy of a system, and is more robust to the curse of dimensionality. The learning algorithms for the FITSK models are online and are readily applicable to adaptive systems, with fast convergence. Finally, a guideline is discussed for selecting among the different FITSK models when tackling the multi-criteria design problem of applying the TSK model. Extensive simulations were conducted using the proposed FITSK models and their learning algorithms; their performance is encouraging when benchmarked against other popular fuzzy systems.
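As background for the recursive least-squares tuning mentioned in this abstract, here is a minimal sketch of the standard (global) recursive least-squares update for a linear model. It is not the localized variant proposed in the paper, whose details the abstract does not give. The names `rls_update` and `fit_stream`, the forgetting factor `lam` and the initial scale `p0` are illustrative assumptions.

```python
def rls_update(theta, P, x, y, lam=1.0):
    """One recursive least-squares step for the linear model y ~ theta . x.

    theta: current parameter estimate (list of floats)
    P:     current inverse-correlation matrix estimate (list of rows, symmetric)
    lam:   forgetting factor in (0, 1]; 1.0 means no forgetting
    """
    n = len(x)
    Px = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]
    denom = lam + sum(x[i] * Px[i] for i in range(n))
    gain = [v / denom for v in Px]
    err = y - sum(theta[i] * x[i] for i in range(n))  # prediction error before the update
    new_theta = [theta[i] + gain[i] * err for i in range(n)]
    # P <- (P - gain (P x)^T) / lam, using the symmetry of P
    new_P = [[(P[i][j] - gain[i] * Px[j]) / lam for j in range(n)] for i in range(n)]
    return new_theta, new_P


def fit_stream(samples, dim, p0=1e6):
    """Process (x, y) pairs one at a time, starting from theta = 0 and P = p0 * I."""
    theta = [0.0] * dim
    P = [[p0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    for x, y in samples:
        theta, P = rls_update(theta, P, x, y)
    return theta
```

Each sample costs a fixed amount of work that does not grow with the number of samples seen, which is what makes the update usable online. A localized version would presumably weight each sample by its membership in a local region, but that weighting is an assumption and is not shown here.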

8.
The literature on English for academic purposes (EAP) methodology highlights the significance of learners' engagement in language learning (Hyland, 2006) in mainstream general and online contexts. Blogs have been recommended in many studies as having the potential to bring a sense of community and collaboration to online classes. Therefore, this study sought to investigate whether blogs in large classes would help students enhance their perceptions of learning. To this end, forty-two undergraduate students of Information Technology (IT) at an Iranian university participated in a weblog writing course designed to promote collaboration and reflective learning. Instrumentation included a questionnaire on perceived learning and sense of community, semi-structured interviews, and participant observations. The findings revealed a significant difference in perceived learning between the students with a low sense of community and those with a high sense of community. Based on the qualitative findings of the study, we suggest an assessment framework incorporating constructivist and social-interactionist theories of learning in order to treat students as members of a community of learning. The findings may have implications for gearing EAP assessment toward more collaborative modes in online courses and suggest a model framework for the assessment of students in EAP online classes.

9.
Advanced Robotics, 2013, 27(15): 2015-2034
Precise models of robot inverse dynamics allow the design of significantly more accurate, energy-efficient and compliant robot control. However, in some cases the accuracy of rigid-body models does not suffice for sound control performance because of unmodeled nonlinearities arising from hydraulic cable dynamics, complex friction or actuator dynamics. In such cases, estimating the inverse dynamics model from measured data is an interesting alternative. Nonparametric regression methods, such as Gaussian process regression (GPR) or locally weighted projection regression (LWPR), are not as restrictive as parametric models and thus offer a more flexible framework for approximating unknown nonlinearities. In this paper, we propose a local approximation to standard GPR, called local GPR (LGP), for real-time online model learning that combines the strengths of both regression methods, i.e., the high accuracy of GPR and the speed of LWPR. The approach is shown to have competitive learning performance on high-dimensional data while being sufficiently fast for real-time learning. The effectiveness of LGP is shown by a comparison with state-of-the-art regression techniques such as GPR, LWPR and ν-support vector regression. The applicability of the proposed LGP method is demonstrated by real-time online learning of the inverse dynamics model for model-based control on a Barrett WAM robot arm.

10.
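A lithium-ion battery is a complex electrochemical dynamic system, and accurate real-time state-of-health (SOH) estimation is essential for maintaining the lithium traction batteries of electric vehicles; traditional modeling methods struggle to estimate SOH online. Starting from the goal of evaluating battery SOH in real time, and building on incremental learning, indicators related to battery health are selected to build an SOH prediction model. To address the time cost of incremental learning, an HI-DD algorithm incorporating a sliding-window technique is proposed; the algorithm detects whether concept drift has occurred and thereby guides and determines where the model should be updated. A model update strategy combining HI-DD with AdaBoost.RT is then designed to improve the online learning performance and prediction accuracy of the model. Finally, the proposed method is validated on battery aging data provided by CALCE. The results show that the incremental-learning-based HI-DD-AdaBoost.RT prediction algorithm has strong online update capability and high prediction accuracy, and can meet the practical requirements of online SOH prediction.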

11.
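Sliding mode predictive control method for MIMO systems based on lazy learning   Total citations: 1 (self-citations: 0; citations by others: 1)
To address the control of MIMO nonlinear systems, a data-driven control strategy is adopted: the lazy learning algorithm, which is inherently adaptive, is combined with sliding mode predictive control, which is strongly robust, to design a lazy-learning-based sliding mode predictive control (LL-SMPC) method. Building on online local modeling, the method uses a sliding mode predictive control law to compute the optimal control input. It has strong adaptive and disturbance-rejection capabilities and avoids solving the Diophantine equation, which effectively reduces the computational load. Simulation studies verify the effectiveness of the algorithm.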

12.
Although model-free reinforcement learning is a powerful tool for solving nonlinear complex system control problems, it can hardly guarantee system stability in the early stage of learning, especially when high-complexity learning components are used. In this paper, a reinforcement learning framework that imitates several cognitive mechanisms of the brain, such as attention, competition, and integration, is proposed to realize sample-efficient, self-stabilized online learning control. Inspired by the generation of consciousness in the human brain, the framework applies multiple actors that work either competitively, for the best interaction results, or cooperatively, for more accurate modeling and prediction. A deep reinforcement learning implementation for challenging control tasks and a real-time control implementation of the proposed framework are given to demonstrate, respectively, its high sample efficiency and its ability to maintain system stability during online learning without requiring an initial admissible control.

13.
The recently proposed classification algorithm extreme SVM (ESVM) has been shown to provide very good generalization performance in relatively short time; however, it is ill-suited to large-scale data sets because of its highly intensive computation. We therefore propose an efficient parallel ESVM (PESVM) based on MapReduce, a powerful and widely used parallel programming framework. Furthermore, we observe that when new training data arrive, it is wasteful for ESVM to always retrain a new model on all training data (both old and new). Along this line, we develop an incremental learning algorithm for ESVM (IESVM), which meets the online learning requirement of updating the existing model. We also provide a parallel version of IESVM (PIESVM), which addresses the large-scale problem and the online problem at the same time. The experimental results show that the proposed parallel algorithms not only can handle large-scale data sets but also scale well in terms of speedup, sizeup and scaleup. It is also worth mentioning that PESVM, IESVM and PIESVM are much more efficient than ESVM while obtaining exactly the same solutions as ESVM.

14.
In this paper, we propose a novel online framework for behavior understanding in visual workflows that is capable of achieving high recognition rates in real time. To achieve online recognition, we propose a methodology that employs a Bayesian filter supported by hidden Markov models. We also introduce a novel re-adjustment framework for behavior recognition and classification that incorporates the user’s feedback into the learning process through two proposed schemes: a plain non-linear one and a more sophisticated recursive one. The proposed approach aims to dynamically correct erroneous classification results in order to enhance the behavior modeling and therefore the overall classification rates. The performance is thoroughly evaluated in complex real-life visual behavior understanding scenarios in an industrial plant. The obtained results are compared and discussed.

15.
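Objective: Conventional multiple-instance learning (MIL) tracking relies on a self-learning process during tracking, so the classifier degrades easily when tracking of the target fails. To address this problem, a multiple-instance learning tracking method based on online feature selection (MILOFS) is proposed. Method: First, a sparse random matrix is used to simplify the construction of image features in video tracking, with the high-dimensional image information projected through the random matrix. Then, a Fisher linear discriminant model is used to construct the loss function of the bag model, and the discriminative model of the classifier is built directly at the instance level from the instance response values. Finally, the online boosting model is viewed from the perspective of gradient descent, and gradient boosting is used to construct the selection model of the classifier. Results: In comparative experiments on image sequences from different scenes, the average tracking errors of online adaptive boosting (OAB), online multiple-instance learning tracking (MILTrack), weighted multiple-instance learning tracking (WMIL), and MILOFS were 36, 23, 24, and 13 pixels, respectively. The proposed algorithm tracks the target accurately under illumination changes, occlusion, and deformation, and has high real-time performance. Conclusion: By using gradient boosting during tracking and building the discriminative model of the bag model directly at the instance level, MIL tracking based on online feature selection can effectively overcome the classifier degradation problem of conventional multiple-instance learning.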

16.
The proliferation of networked data in various disciplines has motivated a surge of research interest in network or graph mining. Node classification is a typical learning task in this area; it exploits node interactions to infer the missing labels of unlabeled nodes in the network. The vast majority of existing node classification algorithms focus on static networks and assume that the whole network structure is available before learning begins. However, this is not the case in many real-world scenarios, where new nodes and new links are continuously added to the network. Considering the streaming nature of such networks, we study how to perform online node classification on them (i.e., online learning on streaming networks). Because noisy links may degrade node classification performance, we first present an online network embedding algorithm that alleviates this problem by obtaining the embedding representation of new nodes on the fly. We then feed the learned embedding representation into a novel online soft margin kernel learning algorithm to predict the node labels sequentially. A theoretical analysis is presented to show the superiority of the proposed framework of online learning on streaming networks (OLSN). Extensive experiments on real-world networks further demonstrate the effectiveness and efficiency of the proposed OLSN framework.

17.
During the last decade, immersive virtual reality (VR) has made great progress in different application areas. For more advanced large-scale immersive VR environments or systems, one of the main challenges is to accurately track the position of a part of the user’s body, such as the head, while he or she is immersed in the environment and perceives the changes among the synthetic stereoscopic image sequences. Unfortunately, accurate tracking is not easy in virtual reality scenarios because of the variety of intrinsic and extrinsic changes that occur while tracking is running. For a single tracker in particular, accurate tracking over a long period is usually not possible because of the model adaptation problem in different environments. A recent trend in tracking research is to incorporate multiple trackers into a composite learning framework and exploit the advantages of the different trackers for more effective tracking. Therefore, in this paper, we propose a novel Bayesian tracking fusion framework with an online classifier ensemble strategy. The proposed method formulates a fusion framework for the online learning of multiple trackers by modeling a cumulative loss minimization process. With an optimal pair-wise sampling scheme for the SVM classifier, the proposed fusion framework achieves more accurate tracking than other state-of-the-art trackers. In addition, experiments on a standard benchmark database verify that the proposed tracking method is able to handle the challenges found in many immersive VR applications and environments.

18.
This study extends the community of inquiry (CoI) framework and self-regulated learning (SRL) theory through an exploration of the structural relationships among existing CoI variables, learning presence (i.e., self-efficacy and online SRL strategy) and learning outcomes in the context of K-12 online learning. To help understand the influence of K-12 mentoring – which is unique to online learning in the U.S. – mentor presence is also included. Structural equation modelling of 696 online 8th through 12th graders' survey responses and final grades showed that adding learning presence to the CoI framework helped to explain how these learners translated their online-learning perceptions into cognitive and affective learning outcomes. We also found that mentor presence significantly and positively predicted online SRL strategy, one of the two components of learning presence. Lastly, we established a connection between the CoI model and various types of learning outcomes that are indicators of K-12 online learning success – though it should be noted that important differences existed between a model based on final grades and two other outcome models. It is hoped that the processes identified in this study will be useful and relevant to K-12 online-learning institutions and educators seeking to improve their offering via a wide range of approaches.

19.
Likas A. Neural Computation, 1999, 11(8): 1915-1932
A general technique is proposed for embedding online clustering algorithms based on competitive learning in a reinforcement learning framework. The basic idea is that the clustering system can be viewed as a reinforcement learning system that learns through reinforcements to follow the clustering strategy we wish to implement. In this sense, we propose the reinforcement-guided competitive learning (RGCL) algorithm, a reinforcement-based adaptation of learning vector quantization (LVQ) with enhanced clustering capabilities. In addition, we suggest extensions of RGCL and LVQ that are characterized by the property of sustained exploration and significantly improve the performance of those algorithms, as indicated by experimental tests on well-known data sets.
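For readers unfamiliar with the competitive learning that RGCL builds on, the following is a minimal sketch of plain winner-take-all online clustering, in which only the prototype nearest to each sample is moved toward it. It is a baseline illustration only and does not implement the reinforcement-guided update of the paper. The function names and the learning rate `lr` are assumptions.

```python
def nearest(prototypes, x):
    """Index of the prototype closest to x in squared Euclidean distance."""
    def dist2(p):
        return sum((pi - xi) ** 2 for pi, xi in zip(p, x))
    return min(range(len(prototypes)), key=lambda k: dist2(prototypes[k]))


def competitive_step(prototypes, x, lr=0.1):
    """Move only the winning prototype a fraction lr toward sample x (in place)."""
    k = nearest(prototypes, x)
    prototypes[k] = [pi + lr * (xi - pi) for pi, xi in zip(prototypes[k], x)]
    return k
```

Feeding samples one at a time through `competitive_step` gradually pulls each prototype toward the centre of the samples it wins, which is the online clustering behaviour that the abstract recasts as a reinforcement learning problem.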

20.
Monte-Carlo tree search for Bayesian reinforcement learning   Total citations: 2 (self-citations: 2; citations by others: 0)
Bayesian model-based reinforcement learning can be formulated as a partially observable Markov decision process (POMDP) to provide a principled framework for optimally balancing exploitation and exploration. A POMDP solver can then be used to solve the problem. If the prior distribution over the environment’s dynamics is a product of Dirichlet distributions, the POMDP’s optimal value function can be represented using a set of multivariate polynomials. Unfortunately, the size of the polynomials grows exponentially with the problem horizon. In this paper, we examine the use of an online Monte-Carlo tree search (MCTS) algorithm for large POMDPs to solve the Bayesian reinforcement learning problem online. We show that such an algorithm successfully searches for a near-optimal policy. In addition, we examine the use of a parameter tying method to keep the model search space small, and propose the use of a nested mixture of tied models to increase the robustness of the method when our prior information does not allow us to specify the structure of tied models exactly. Experiments show that the proposed methods substantially improve the scalability of current Bayesian reinforcement learning methods.
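The product-of-Dirichlets prior mentioned in this abstract can be illustrated with a small sketch: one Dirichlet per state-action pair, updated by counting observed transitions. This covers only the belief update and not the MCTS planner or the parameter tying. The class name `TransitionBelief`, its methods, and the symmetric prior value are hypothetical choices made for illustration.

```python
from collections import defaultdict


class TransitionBelief:
    """Independent Dirichlet belief over next-state probabilities for each (state, action)."""

    def __init__(self, n_states, prior=1.0):
        self.n_states = n_states
        # Each (state, action) pair starts from a symmetric Dirichlet(prior, ..., prior).
        self.alpha = defaultdict(lambda: [prior] * n_states)

    def observe(self, s, a, s_next):
        """Conjugate update: add one to the count of the observed next state."""
        self.alpha[(s, a)][s_next] += 1.0

    def mean(self, s, a):
        """Posterior mean transition probabilities for (s, a)."""
        alpha = self.alpha[(s, a)]
        total = sum(alpha)
        return [v / total for v in alpha]
```

Because the Dirichlet is conjugate to the categorical distribution, the whole belief state is just a table of counts, which is what allows the learning problem to be treated as planning over a fully specified, if large, POMDP.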
