Similar literature
 Found 10 similar articles; search took 92 ms
1.
Locally weighted learning (LWL) is a class of techniques from nonparametric statistics that provides useful representations and training algorithms for learning about complex phenomena during autonomous adaptive control of robotic systems. This paper introduces several LWL algorithms that have been tested successfully in real-time learning of complex robot tasks. We discuss two major classes of LWL: memory-based LWL, and purely incremental LWL that does not need to remember any data explicitly. In contrast to the traditional belief that LWL methods cannot work well in high-dimensional spaces, we provide new algorithms that have been tested on learning problems of up to 90 dimensions. The applicability of our LWL algorithms is demonstrated in various robot learning examples, including the learning of devil-sticking, pole-balancing by a humanoid robot arm, and inverse-dynamics learning for a seven and a 30 degree-of-freedom robot. In all these examples, the application of our statistical neural network techniques allowed either faster or more accurate acquisition of motor control than classical control engineering.
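To make the memory-based variant mentioned in this abstract concrete, here is a minimal sketch of locally weighted linear regression for a single input dimension. It is an illustration only, not the paper's algorithm: the Gaussian kernel, the fixed `bandwidth`, and the function name `lwr_predict` are all assumptions chosen for brevity.

```python
import math


def lwr_predict(xs, ys, query, bandwidth):
    """Predict y at `query` by fitting a weighted line to the stored data.

    Hypothetical helper: every stored sample is kept in memory and
    weighted by a Gaussian kernel centred on the query point.
    """
    # Samples near the query get weights close to 1, distant ones close to 0.
    weights = [math.exp(-0.5 * ((x - query) / bandwidth) ** 2) for x in xs]
    total = sum(weights)

    # Weighted means of inputs and outputs.
    mean_x = sum(w * x for w, x in zip(weights, xs)) / total
    mean_y = sum(w * y for w, y in zip(weights, ys)) / total

    # Weighted least-squares slope around the weighted means.
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
    sxy = sum(w * (x - mean_x) * (y - mean_y)
              for w, x, y in zip(weights, xs, ys))
    slope = sxy / sxx if sxx > 1e-12 else 0.0

    return mean_y + slope * (query - mean_x)


if __name__ == "__main__":
    # Noise-free samples of sin(x); the local fit should track the curve.
    xs = [0.1 * i for i in range(63)]
    ys = [math.sin(x) for x in xs]
    print(lwr_predict(xs, ys, query=1.0, bandwidth=0.3))
```

A new local model is fitted for every query, so prediction cost grows with the number of stored samples; the purely incremental class described in the abstract avoids storing the data at all.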

2.
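A Survey of the Sparse Reward Problem in Deep Reinforcement Learning   Total citations: 1, self-citations: 0, citations by others: 1
Reinforcement learning, an important branch of machine learning, is a class of methods that search for an optimal policy through interaction with the environment. In recent years it has been widely combined with deep learning, forming the research field of deep reinforcement learning. As a new machine learning approach, deep reinforcement learning can both perceive complex inputs and solve for optimal policies, and can be applied to complex decision-making problems such as robot control. The sparse reward problem is a core problem that deep reinforcement learning faces when solving tasks, and it is widespread in practical applications. Solving it helps improve sample efficiency, raise the quality of the learned policy, and promote the wide application of deep reinforcement learning to real tasks. This paper first describes the core algorithms of deep reinforcement learning; it then introduces five kinds of solutions to the sparse reward problem, namely reward design and learning, experience replay mechanisms, exploration and exploitation, multi-goal learning, and auxiliary tasks; finally, it summarizes related work and discusses future directions.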

3.
David Foster, Peter Dayan. Machine Learning, 2002, 49(2-3): 325-346
Efficiently solving many different optimal control tasks within the same underlying environment requires decomposing the environment into its computationally elemental fragments. We suggest how to find fragmentations using unsupervised mixture-model learning methods on data derived from optimal value functions for multiple tasks, and show that these fragmentations are in accord with observable structure in the environments. Further, we present evidence that such fragments can be of use in a practical reinforcement learning context, by facilitating online, actor-critic learning of multiple-goal MDPs.

4.
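As a representative technology of the age of automation and intelligence, robotics has become a focus of research in intelligent control. A variety of robot-based intelligent control techniques have emerged, and robots are increasingly used to carry out complex multi-contact interaction tasks with their environment. This paper takes complex multi-contact interaction tasks for robots as its central problem and, drawing on research into training robot agents with reinforcement learning, surveys reinforcement learning approaches to such tasks. It outlines representative studies of reinforcement learning for robot multi-contact tasks, the problems in current research, and the optimization methods used to improve experimental results on multi-contact interaction tasks. Based on current results and the characteristics of each optimization method, it also discusses the outlook for intelligent control methods for future robot multi-contact interaction tasks.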

5.
Motor-skill learning for complex robotic tasks is a challenging problem due to the high task variability. Robotic clothing assistance is one such challenging problem, and solving it can greatly improve the quality of life for the elderly and disabled. In this study, we propose a data-efficient representation to encode task-specific motor skills of the robot using Bayesian nonparametric latent variable models. The effectiveness of the proposed motor-skill representation is demonstrated in two ways: (1) through a real-time controller that can be used as a tool for learning from demonstration to impart novel skills to the robot and (2) by demonstrating that policy search reinforcement learning in such a task-specific latent space outperforms learning in the high-dimensional joint configuration space of the robot. We implement our proposed framework in a practical setting with a dual-arm robot performing clothing assistance tasks.

6.
Transfer of Learning by Composing Solutions of Elemental Sequential Tasks   Total citations: 2, self-citations: 0, citations by others: 2
Although building sophisticated learning agents that operate in complex environments will require learning to perform multiple tasks, most applications of reinforcement learning have focused on single tasks. In this paper I consider a class of sequential decision tasks (SDTs), called composite sequential decision tasks, formed by temporally concatenating a number of elemental sequential decision tasks. Elemental SDTs cannot be decomposed into simpler SDTs. I consider a learning agent that has to learn to solve a set of elemental and composite SDTs. I assume that the structure of the composite tasks is unknown to the learning agent. The straightforward application of reinforcement learning to multiple tasks requires learning the tasks separately, which can waste computational resources, both memory and time. I present a new learning algorithm and a modular architecture that learns the decomposition of composite SDTs, and achieves transfer of learning by sharing the solutions of elemental SDTs across multiple composite SDTs. The solution of a composite SDT is constructed by computationally inexpensive modifications of the solutions of its constituent elemental SDTs. I provide a proof of one aspect of the learning algorithm.

7.
Flexible latent variable models for multi-task learning   Total citations: 1, self-citations: 1, citations by others: 0
Given multiple prediction problems such as regression or classification, we are interested in a joint inference framework that can effectively share information between tasks to improve the prediction accuracy, especially when the number of training examples per problem is small. In this paper we propose a probabilistic framework which can support a set of latent variable models for different multi-task learning scenarios. We show that the framework is a generalization of standard learning methods for single prediction problems and that it can effectively model the shared structure among different prediction tasks. Furthermore, we present efficient algorithms for the empirical Bayes method as well as point estimation. Our experiments on both simulated datasets and real-world classification datasets show the effectiveness of the proposed models in two evaluation settings: a standard multi-task learning setting and a transfer learning setting.

8.
Although model-free reinforcement learning is a powerful tool for solving nonlinear complex system control problems, it can hardly guarantee system stability in the early stage of learning, especially when high-complexity learning components are applied. In this paper, a reinforcement learning framework imitating many cognitive mechanisms of the brain, such as attention, competition, and integration, is proposed to realize sample-efficient, self-stabilized online learning control. Inspired by the generation of consciousness in the human brain, the framework applies multiple actors that work either competitively for the best interaction results or cooperatively for more accurate modeling and prediction. A deep reinforcement learning implementation for challenging control tasks and a real-time control implementation of the proposed framework are given to demonstrate, respectively, the high sample efficiency and the capability of maintaining system stability in the online learning process without requiring an initial admissible control.

9.
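Most deep-learning-based 3D model classification methods target one specific task; they perform poorly on diverse 3D model classification tasks and lack generality. To address this, a general end-to-end deep ensemble learning model, E2E-DEL (end-to-end deep ensemble learning), is proposed. It consists of multiple base learners and one ensemble learner and can automatically learn composite feature information from complex 3D models. It uses a hierarchical iterative learning strategy that takes into account the feature-learning capability of networks at different levels and balances the sub-feature learning of each base learner against the ensemble feature learning of the ensemble learner, so that it adapts to diverse 3D model classification tasks. On this basis, a multi-view deep ensemble learning network, MV-DEL (multi-view deep ensemble learning), is designed and applied to three different types of 3D model classification task: general, fine-grained, and zero-shot. Experiments on several public datasets verify that the method generalizes well and is broadly applicable.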

10.
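Deep-Learning-Based Digital Pathology Image Segmentation: A Survey and Outlook   Total citations: 1, self-citations: 0, citations by others: 1
宋杰, 肖亮, 练智超, 蔡子贇, 蒋国平. 软件学报 (Journal of Software), 2021, 32(5): 1427-1460
Digital pathology image analysis is of great significance for the benign/malignant grading and diagnosis of cancers such as breast cancer and prostate cancer, and the morphology and measurements of tissue primitives are an important basis for quantitative analysis. However, because of the diversity and complexity of pathology data, the segmentation task faces challenges such as difficult feature extraction and difficult instance segmentation. Artificial-intelligence-assisted quantitative pathology analysis transforms complex pathology data into minable image features, making it possible to extract quantitative information about tissue primitives automatically. In particular, with the rapid growth of computing power, deep learning, with its strong feature learning and flexible design, has achieved breakthrough results in quantitative digital pathology analysis. This paper systematically reviews current representative deep learning methods, including convolutional neural networks, fully convolutional networks, encoder-decoder models, recurrent neural networks, and generative adversarial networks; summarizes the modeling mechanisms and applications of deep learning in tasks such as pathology image segmentation; and surveys the theory, key techniques, strengths and weaknesses, and performance of existing methods. Finally, it discusses open challenges and new trends in deep learning models for digital pathology image segmentation.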
