Similar literature
 Found 20 similar documents
1.
We present a comprehensive survey of robot Learning from Demonstration (LfD), a technique that develops policies from example state-to-action mappings. We introduce the LfD design choices in terms of demonstrator, problem space, policy derivation and performance, and contribute the foundations for a structure in which to categorize LfD research. Specifically, we analyze and categorize the multiple ways in which examples are gathered, ranging from teleoperation to imitation, as well as the various techniques for policy derivation, including matching functions, dynamics models and plans. To conclude, we discuss LfD limitations and related promising areas for future research.

2.
The most difficult, and often most essential, aspect of many interception and tracking tasks is constructing motion models of the targets. Experts rarely can provide complete information about a target's expected motion pattern, and fitting parameters for complex motion patterns can require large amounts of training data. Specifying how to parameterize complex motion patterns is in itself a difficult task. In contrast, Bayesian nonparametric models of target motion are very flexible and generalize well with relatively little training data. We propose modeling target motion patterns as a mixture of Gaussian processes (GP) with a Dirichlet process (DP) prior over mixture weights. The GP provides an adaptive representation for each individual motion pattern, while the DP prior allows us to represent an unknown number of motion patterns. Both automatically adjust the complexity of the motion model based on the available data. Our approach outperforms several parametric models on a helicopter-based car-tracking task on data collected from the greater Boston area.

3.
The paper achieves two outcomes. First, it summarizes previous work on concurrent Markov decision processes (CMDPs) currently demonstrated for use with multi-agent foraging problems. When using CMDPs, each agent models the environment using two Markov decision processes (MDPs). The two MDPs characterize a multi-agent foraging problem by modeling both a single-agent foraging problem and a multi-agent task allocation problem for each agent. Second, the paper studies the effects of state uncertainty on a heterogeneous robot team that utilizes the aforementioned CMDP modelling approach. Furthermore, the paper presents a method to maintain performance despite state uncertainty. The resulting robust concurrent individual and social learning (RCISL) mechanism leads to an enhanced team learning behaviour despite state uncertainty. The paper analyzes the performance of the concurrent individual and social learning mechanism with and without a particle filter for a heterogeneous foraging scenario. The RCISL mechanism confers statistically significant performance improvements over the CISL mechanism.

4.
Robot learning by demonstration is key to bringing robots into daily social environments to interact with and learn from humans and other agents. However, teaching a robot to acquire new knowledge is a tedious and repetitive process and is often restricted to a specific setup of the environment. We propose a template-based learning framework for robot learning by demonstration to address both generalisation and adaptability. This novel framework is based upon a one-shot learning model integrated with spectral clustering and an online learning model to learn and adapt actions in similar scenarios. A set of statistical experiments is used to benchmark the framework components and shows that this approach requires no extensive training for generalisation and can adapt to environmental changes flexibly. Two real-world applications of an iCub humanoid robot playing the tic-tac-toe game and soldering a circuit board are used to demonstrate the relative merits of the framework.

5.
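To improve the reconstruction quality of magnetic resonance images, a reconstruction method based on nonparametric Bayesian classified dictionary learning is proposed. After a difference transform, image patches are clustered automatically in the gradient domain using an infinite Gaussian mixture model, and a dictionary is trained for each class of patches with similar structure. The dictionaries are trained with a nonparametric Bayesian dictionary learning method, which removes the dependence on parameter selection found in traditional dictionary learning. Experimental results show that, compared with several typical current magnetic resonance image reconstruction methods, the proposed method improves the peak signal-to-noise ratio by 2.9 dB on average; at the same noise level, it is more robust to noise and yields better reconstruction quality.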

6.
Pattern Analysis and Applications - Developing effective machine learning methods for multimedia data modeling continues to challenge computer vision scientists. The capability of providing...

8.
In this paper, a multi-agent reinforcement learning method based on predicting the actions of other agents is proposed. In a multi-agent system, action selection of the learning agent is unavoidably impacted by other agents' actions. Therefore, joint states and joint actions are involved in the multi-agent reinforcement learning system. A novel agent action prediction method based on the probabilistic neural network (PNN) is proposed. PNN is used to predict the actions of other agents. Furthermore, a policy-sharing mechanism is used to exchange the learning policies of multiple agents, with the aim of speeding up learning. Finally, the application of the presented method to robot soccer is studied. Through learning, robot players can master the mapping policy from the state information to the action space. Moreover, coordination and cooperation among multiple robots are well realized.

9.
We present a novel approach to estimating depth from single omnidirectional camera images by learning the relationship between visual features and range measurements available during a training phase. Our model yields not only the most likely distance to obstacles in all directions, but also the predictive uncertainties for these estimates. This information can be utilized by a mobile robot to build an occupancy grid map of the environment or to avoid obstacles during exploration, tasks that typically require dedicated proximity sensors such as laser range finders or sonars. We show in this paper how an omnidirectional camera can be used as an alternative to such range sensors. As the learning engine, we apply Gaussian processes, a nonparametric approach to function regression, as well as a recently developed extension for dealing with input-dependent noise. In practical experiments carried out in different indoor environments with a mobile robot equipped with an omnidirectional camera system, we demonstrate that our system is able to estimate range with an accuracy comparable to that of dedicated sensors based on sonar or infrared light.

10.
This paper proposes an end-to-end learning from demonstration framework for teaching force-based manipulation tasks to robots. The strengths of this work are manifold. First, we deal with the problem of learning through force perceptions exclusively. Second, we propose to exploit haptic feedback both as a means for improving teacher demonstrations and as a human–robot interaction tool, establishing a bidirectional communication channel between the teacher and the robot, in contrast to the works using kinesthetic teaching. Third, we address the well-known "what to imitate?" problem from a different point of view, based on the mutual information between perceptions and actions. Lastly, the teacher's demonstrations are encoded using a Hidden Markov Model, and the robot execution phase is developed by implementing a modified version of Gaussian Mixture Regression that uses implicit temporal information from the probabilistic model, needed when tackling tasks with ambiguous perceptions. Experimental results show that the robot is able to learn and reproduce two different manipulation tasks, with a performance comparable to the teacher's.

11.
In this paper, we present a novel methodology for preference learning based on the concept of inductive transfer. Specifically, we introduce a nonparametric hierarchical Bayesian multitask learning approach, based on the notion that human subjects may cluster together forming groups of individuals with similar preference rationale (but not identical preferences). Our approach is facilitated by the utilization of a Dirichlet process prior, which allows for the automatic inference of the most appropriate number of subject groups (clusters), as well as the employment of the automatic relevance determination (ARD) mechanism, giving rise to a sparse nature for our model, which significantly enhances its computational efficiency. We explore the efficacy of our novel approach by applying it to both a synthetic experiment and a real-world music recommendation application. As we show, our approach offers a significant enhancement in the effectiveness of knowledge transfer in statistical preference learning applications, being capable of correctly inferring the actual number of human subject groups in a modeled dataset, and limiting knowledge transfer only to subjects belonging to the same group (wherein knowledge transferability is more likely).

12.
13.
An approach to learning mobile robot navigation   Total citations: 1, self-citations: 0, citations by others: 1
This paper describes an approach to learning an indoor robot navigation task through trial-and-error. A mobile robot, equipped with visual, ultrasonic and laser sensors, learns to servo to a designated target object. In less than ten minutes of operation time, the robot is able to navigate to a marked target object in an office environment. The central learning mechanism is the explanation-based neural network learning algorithm (EBNN). EBNN initially learns functions purely inductively using neural network representations. With increasing experience, EBNN employs domain knowledge to explain and to analyze training data in order to generalize in a more knowledgeable way. Here EBNN is applied in the context of reinforcement learning, which allows the robot to learn control using dynamic programming.

14.
Journal of Intelligent Manufacturing - Robot learning from demonstration (LfD) emerges as a promising solution to transfer human motion to the robot. However, because of the open-loop between the...

15.
Object manipulation is a challenging task for robotics, as the physics involved in object interaction is complex and hard to express analytically. Here we introduce a modular approach for learning a manipulation strategy from human demonstration. First, we record a human performing a task that requires an adaptive control strategy in different conditions, i.e. different task contexts. We then perform modular decomposition of the control strategy, using phases of the recorded actions to guide segmentation. Each module represents a part of the strategy, encoded as a pair of forward and inverse models. All modules contribute to the final control policy; their recommendations are integrated via a system of weighting based on their own estimated error in the current task context. We validate our approach both in simulation, for clarity, and on a real robot platform, to demonstrate robustness and capacity to generalise. The robot task is opening bottle caps. We show that our approach can modularize an adaptive control strategy and generate appropriate motor commands for the robot to accomplish the complete task, even for novel bottles.

16.
This paper describes a syntactic approach to imitation learning that captures important task structures in the form of probabilistic activity grammars from a reasonably small number of samples under noisy conditions. We show that these learned grammars can be recursively applied to help recognize unforeseen, more complicated tasks that share underlying structures. The grammars enforce an observation to be consistent with the previously observed behaviors, which can correct unexpected, out-of-context actions due to errors of the observer and/or demonstrator. To achieve this goal, our method (1) actively searches for frequently occurring action symbols that are subsets of input samples to uncover the hierarchical structure of the demonstration, and (2) considers the uncertainties of input symbols due to imperfect low-level detectors. We evaluate the proposed method using both synthetic data and two sets of real-world humanoid robot experiments. In our Towers of Hanoi experiment, the robot learns the important constraints of the puzzle after observing demonstrators solving it. In our Dance Imitation experiment, the robot learns 3 types of dances from human demonstrations. The results suggest that under a reasonable amount of noise, our method is capable of capturing the reusable task structures and generalizing them to cope with recursions.

17.
18.
In this paper we present a new method for Joint Feature Selection and Classifier Learning using a sparse Bayesian approach. These tasks are performed by optimizing a global loss function that includes a term associated with the empirical loss and another one representing a feature selection and regularization constraint on the parameters. To minimize this function we use a recently proposed technique, the Boosted Lasso algorithm, that follows the regularization path of the empirical risk associated with our loss function. We develop the algorithm for a well-known nonparametric classification method, the relevance vector machine, and perform experiments using a synthetic data set and three databases from the UCI Machine Learning Repository. The results show that our method is able to select the relevant features, increasing in some cases the classification accuracy when feature selection is performed.

19.
20.
A Bayesian nonparametric approach to modeling a nonlinear dynamic model is presented. New techniques for sampling infinite mixture models are used. The inference procedure is demonstrated specifically for the logistic model, with the nonparametric component applied to the additive errors.
