20 similar documents found.
1.
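Model-based reinforcement learning learns a model of the environment and then performs policy optimization or planning on that model, bringing robots closer to the way humans learn and interact. This paper briefly states the definition of the robot learning problem and introduces model-based reinforcement learning methods for robot learning, covering the mainstream approaches to model learning and model utilization. For model learning, it covers forward dynamics models, inverse dynamics models, and implicit models. For model utilization, it covers model-based planning, ...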
2.
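After a brief overview of reinforcement learning theory, this article discusses key problems frequently encountered when applying reinforcement learning to real robots: continuous state-action spaces, credit assignment, the balance between exploration and exploitation, and incomplete information. It presents some commonly used solutions, intended as a reference for related research and applications.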
3.
《Pattern Recognition and Artificial Intelligence》2005,18(5)
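Reinforcement learning improves a policy through trial-and-error interaction with the environment, and its self-learning and online-learning characteristics have made it an important branch of machine learning research. However, reinforcement learning has long been plagued by the "curse of dimensionality". In recent years, hierarchical reinforcement learning methods have introduced abstraction mechanisms and made significant progress in overcoming it. As theoretical background, this paper first introduces the basic principles of reinforcement learning and the Q-learning algorithm based on semi-Markov processes. It then presents the basic ideas and Q-learning update formulas of three typical single-agent hierarchical reinforcement learning methods (Option, HAM, and MAXQ), summarizes the essential characteristics of each, and compares and evaluates the three. Finally, it identifies the problems that must be solved to extend single-agent hierarchical reinforcement learning to multi-agent hierarchical reinforcement learning.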
4.
Haiyang Chao, Yu Gu, Marcello Napolitano 《Journal of Intelligent and Robotic Systems》2014,73(1-4):361-372
Optical flow has been widely used by insects and birds to support navigation functions. Such information has appealing capabilities for application to ground and aerial robots, especially for navigation and collision avoidance in urban or indoor areas. The purpose of this paper is to provide a survey of existing optical flow techniques for robot navigation applications. Detailed comparisons are made among different optical-flow-aided navigation solutions, with emphasis on the sensor hardware as well as the optical flow motion models. The current research status and future research directions are also summarized.
5.
Cost-Sensitive Learning of Classification Knowledge and Its Applications in Robotics  Cited by: 7 (self-citations: 1, citations by others: 7)
Traditional learning-from-examples methods assume that examples are given beforehand and all features are measured for each example. However, in many robotic domains the number of features that could be measured is very large, the cost of measuring those features is significant, and thus the robot must judiciously select which features it will measure. Finding a proper tradeoff between the accuracy (e.g., number of prediction errors) and efficiency (e.g., cost of measuring features) during learning (prior to convergence) is an important part of the problem. Inspired by such robotic domains, this article considers realistic measurement costs of features in the process of incremental learning of classification knowledge. It proposes a unified framework for learning-from-examples methods that trade off accuracy for efficiency during learning, and analyzes two methods (CS-ID3 and CS-IBL) in detail. Moreover, this article illustrates the application of such a cost-sensitive learning method to a real robot designed for an approach-recognize task. The resulting robot learns to approach, recognize, and grasp objects on a floor effectively and efficiently. Experimental results show that highly accurate classification procedures can be learned without sacrificing efficiency in both synthetic and real domains.
6.
7.
Li Yang 《Computer & Digital Engineering》2010,38(5):78-80,174
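Personalized and intelligent instruction is both the focus and the main difficulty of research on intelligent tutoring systems. This article uses intelligent-agent technology to simulate the intelligence and behavior of students in the system, applies reinforcement learning theory to multiple agents, and designs a reinforcement learning algorithm that incorporates eligibility traces, which is used to generate and adjust teaching content and teaching strategies suited to each individual student. The multi-agent technology provides personalized instruction, and the reinforcement learning algorithm makes the teaching strategy intelligent. Experimental results show that the new algorithm is more effective than the original one.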
8.
9.
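A Survey of Multi-Agent Reinforcement Learning Methods in the Stochastic Game Framework  Cited by: 4 (self-citations: 0, citations by others: 4)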
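Multi-agent learning studies, within the framework of stochastic games, how multiple agents acquire interaction skills through self-learning. The success of research on single-agent reinforcement learning, the solid mathematical foundation of game theory, and the broad application prospects in complex task environments have made multi-agent reinforcement learning an important topic in current machine learning research. The paper first gives formal definitions of the basic concepts of stochastic games in multi-agent systems. It then reviews research on learning algorithms in stochastic games and repeated games, along with other related work. Finally, in light of recent developments, it surveys applied research on multi-agent learning in e-commerce, robotics, and military domains, and describes remaining problems and future research directions.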
10.
《International Journal of Automation and Computing》2024,21(3)
With the breakthrough of AlphaGo, deep reinforcement learning has become a recognized technique for solving sequential decision-making problems. Despite its reputation, the data inefficiency caused by its trial-and-error learning mechanism makes deep reinforcement learning difficult to apply in a wide range of areas. Many methods have been developed for sample-efficient deep reinforcement learning, such as environment modelling, experience transfer, and distributed modifications, among which distributed deep reinforcement learning has shown its potential in various applications, such as human-computer gaming and intelligent transportation. In this paper, we summarize the state of this exciting field by comparing the classical distributed deep reinforcement learning methods and studying the components that are important for efficient distributed learning, covering settings from single-player single-agent distributed deep reinforcement learning to the most complex multi-player multi-agent distributed deep reinforcement learning. Furthermore, we review recently released toolboxes that help to realize distributed deep reinforcement learning without many modifications of their non-distributed versions. Based on an analysis of their strengths and weaknesses, a multi-player multi-agent distributed deep reinforcement learning toolbox is developed and released, which is further validated on Wargame, a complex environment, showing the usability of the proposed toolbox for multi-player multi-agent distributed deep reinforcement learning in complex games. Finally, we point out challenges and future trends, hoping that this brief review can provide a guide or a spark for researchers who are interested in distributed deep reinforcement learning.
11.
A tremendous amount of data is generated and saved in many complex engineering and social systems every day. It is significant and feasible to use this big data to make better decisions with machine learning techniques. In this paper, we focus on batch reinforcement learning (RL) algorithms for discounted Markov decision processes (MDPs) with large discrete or continuous state spaces, aiming to learn the best possible policy given a fixed amount of training data. Batch RL algorithms with handcrafted feature representations work well for low-dimensional MDPs. However, for many real-world RL tasks, which often involve high-dimensional state spaces, it is difficult and even infeasible to use feature engineering methods to design features for value function approximation. To cope with high-dimensional RL problems, the desire to obtain data-driven features has led to much work on incorporating feature selection and feature learning into traditional batch RL algorithms. In this paper, we provide a comprehensive survey of automatic feature selection and unsupervised feature learning for high-dimensional batch RL. Moreover, we present recent theoretical developments on applying statistical learning to establish finite-sample error bounds for batch RL algorithms based on weighted $L_p$ norms. Finally, we outline some future directions in the research of RL algorithms, theories, and applications.
12.
Mohsen Attaran 《Information Systems Management》1990,7(1):14-21
Changes in national and international competition, coupled with the mass application of new technologies, are revolutionizing production. Consumers are demanding quality, and manufacturers are being forced to provide it. Companies that wish to remain competitive and maintain their market share are turning to such advanced technologies as computer-aided industrial robots. This article discusses the benefits and justification of the industrial robot and provides managers in manufacturing firms with an overview of its potential applications.
13.
14.
15.
16.
17.
《IEEE Transactions on Parallel and Distributed Systems》2006,17(12):1512-1525
Many modern networked applications require specific levels of service quality from the underlying network. Moreover, next-generation networked applications are expected to adapt to changes in the underlying network, services, and user interactions. While some applications have built-in adaptivity, the adaptation itself requires specification of a system model. This paper presents Sapphire, an experimental approach for systematic model generation for application adaptation within a target network. It employs a nearly-automated, statistical design of experiments to characterize the relationships of both application and network-level parameters. First, it applies the Analysis of Variance (ANOVA) method to identify the most significant parameters and their interactions that affect performance. Next, it generates a model of application performance with respect to these parameters within the ranges of measurements. The key benefit of the framework is the integration of several well-established concepts of statistical modeling and distributed systems in the form of simple APIs so that existing applications can take advantage of it. We demonstrate the usefulness and flexibility of Sapphire by generating a performance model of an audio streaming application. We show that many existing multimedia and QoS-sensitive applications can exploit a statistical modeling approach such as Sapphire to incorporate application adaptivity. The approach can also be used for feedback control of distributed applications, tuning network and application parameters to achieve service levels in a target network.
18.
In executing classical plans in the real world, small discrepancies between a planner's internal representations and the real world are unavoidable. These can conspire to cause real-world failures even though the planner is sound and, therefore, proves that a sequence of actions achieves the goal. Permissive planning, a machine learning extension to classical planning, is one response to this difficulty. This paper describes the permissive planning approach and presents GRASPER, a permissive planning robotic system that learns to robustly pick up novel objects.
19.
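Learning, interaction, and their combination are key capabilities required to build robust, autonomous agents. Reinforcement learning is an important part of agent learning and comprises single-agent reinforcement learning and multi-agent reinforcement learning. This article presents a comparative study of the two, contrasting them in terms of basic concepts, environment frameworks, learning objectives, and learning algorithms. It identifies their differences and connections and discusses some open problems that they face.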
20.
《IEEE transactions on systems, man, and cybernetics. Part B, Cybernetics》2008,38(5):1207-1220