期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

孙世光兰旭光张翰博郑南宁《模式识别与人工智能》2022,35(1):1-16

基于模型的强化学习通过学习一个环境模型和基于此模型的策略优化或规划,实现机器人更接近于人类的学习和交互方式.文中简述机器人学习问题的定义,介绍机器人学习中基于模型的强化学习方法,包括主流的模型学习及模型利用的方法.主流的模型学习方法具体介绍前向动力学模型、逆向动力学模型和隐式模型.模型利用的方法具体介绍基于模型的规划、... 相似文献

2.

强化学习理论在机器人应用中的几个关键问题探讨

张文志吕恬生《计算机工程与应用》2004,40(4):69-71,96

文章在简单概述强化学习理论的基础上,对强化学习在实际机器人应用中经常遇到的连续状态-动作空间、信度分配、探索和利用的平衡、不完整信息等关键性问题进行了讨论,给出了一些常用的解决方法,以期为相关的研究和应用提供一个参考。相似文献

3.

分层强化学习研究综述

《模式识别与人工智能》2005,18(5)

强化学习通过试错与环境交互获得策略的改进,其自学习和在线学习的特点使其成为机器学习研究的一个重要分支.但是,强化学习一直被"维数灾"问题所困扰.近年来,分层强化学习方法引入抽象(Abstraction)机制,在克服"维数灾"方面取得了显著进展.作为理论基础,本文首先介绍了强化学习的基本原理及基于半马氏过程的Q-学习算法.然后介绍了3种典型的单Agent分层强化学习方法(Option、HAM和MAXQ)的基本思想,Q-学习更新公式,概括了各方法的本质特征,并对这3种方法进行了对比分析评价.最后指出了将单Agent分层强化学习方法拓展到多Agent分层强化学习时需要解决的问题. 相似文献

4.

A Survey of Optical Flow Techniques for Robotics Navigation Applications

Haiyang Chao Yu Gu Marcello Napolitano 《Journal of Intelligent and Robotic Systems》2014,73(1-4):361-372

Optical flow has been widely used by insects and birds to support navigation functions. Such information has appealing capabilities for application to ground and aerial robots, especially for navigation and collision avoidance in urban or indoor areas. The purpose of this paper is to provide a survey of existing optical flow techniques for robotics navigation applications. Detailed comparisons are made among different optical-flow-aided navigation solutions with emphasis on the sensor hardware as well as optical flow motion models. A summary of current research status and future research directions are further discussed. 相似文献

5.

Cost-Sensitive Learning of Classification Knowledge and Its Applications in Robotics 总被引：7，自引：1，他引：7

Tan Ming 《Machine Learning》1993,13(1):7-33

Traditional learning-from-examples methods assume that examples are given beforehand and all features are measured for each example. However, in many robotic domains the number of features that could be measured is very large, the cost of measuring those features is significant, and thus the robot must judiciously select which features it will measure. Finding a proper tradeoff between theaccuracy (e.g., number of prediction errors) andefficiency (e.g., cost of measuring features) during learning (prior to convergence) is an important part of the problem. Inspired by such robotic domains, this article considers realistic measurement costs of features in the process of incremental learning of classification knowledge. It proposes a unified framework for learning-from-examples methods that trade off accuracy for efficiency during learning, and analyzes two methods (CS-ID3 and CS-IBL) in detail. Moreover, this article illustrates the application of such a cost-sensitive-learning method to a real robot designed for anapproach-recognize task. The resulting robot learns to approach, recognize, and grasp objects on a floor effectively and efficiently. Experimental results show that highly accurate classification procedures can be learned without sacrificing efficiency in the case of both synthetic and real domains. 相似文献

6.

基于强化学习的推荐研究综述

余力杜启翰岳博妍向君瑶徐冠宇冷友方《计算机科学》2021,48(10):1-18

推荐系统致力于从海量数据中为用户寻找并自动推荐有价值的信息和服务,可有效解决信息过载问题,成为大数据时代一种重要的信息技术.但推荐系统的数据稀疏性、冷启动和可解释性等问题,仍是制约推荐系统广泛应用的关键技术难点.强化学习是一种交互学习技术,该方法通过与用户交互并获得反馈来实时捕捉其兴趣漂移,从而动态地建模用户偏好,可以... 相似文献

7.

多代理强化学习在智能教学系统中的应用

李洋《计算机与数字工程》2010,38(5):78-80,174

教学的个性化和智能化是智能教学系统研究的重点和难点。文章采用智能代理技术模拟系统中学生的智能和行为方式,将强化学习理论应用于多代理体,设计了结合资格迹理论的强化学习算法,并用以生成和调整适合于每个学生个体的教学内容和教学策略。多代理体技术实现了教学的个性化,强化学习算法使得教学策略具有智能化。实验结果表明,新的算法较原有算法更为有效。相似文献

8.

群机器人研究综述 总被引：2，自引：0，他引：2

薛颂东曾建潮《模式识别与人工智能》2008,21(2)

从计算机科学的角度对群机器人进行综述,概括其历史背景和目前的研究工作,回顾和展望其应用领域.群机器人是受社会性昆虫群体行为启发产生的新方法,通过建立经济、鲁棒、柔性和规模可伸缩的多机器人系统,协调控制相对简单的个体机器人合作完成规定的复杂任务.围绕群机器人的定义、与其他多机器人系统的区分准则及系统级功能特征,讨论群机器人研究中的主要问题:个体交互、通信、协作控制和自组织、自组装等,明确其研究概貌. 相似文献

9.

随机博弈框架下的多agent强化学习方法综述 总被引：4，自引：0，他引：4

宋梅萍顾国昌张国印《控制与决策》2005,20(10):1081-1090

多agent学习是在随机博弈的框架下,研究多个智能体间通过自学习掌握交互技巧的问题.单agent强化学习方法研究的成功,对策论本身牢固的数学基础以及在复杂任务环境中广阔的应用前景,使得多agent强化学习成为目前机器学习研究领域的一个重要课题.首先介绍了多agent系统随机博弈中基本概念的形式定义;然后介绍了随机博弈和重复博弈中学习算法的研究以及其他相关工作;最后结合近年来的发展,综述了多agent学习在电子商务、机器人以及军事等方面的应用研究,并介绍了仍存在的问题和未来的研究方向. 相似文献

10.

Distributed Deep Reinforcement Learning:A Survey and a Multi-player Multi-agent Learning Toolbox

《国际自动化与计算杂志》2024,21(3)

With the breakthrough of AlphaGo,deep reinforcement learning has become a recognized technique for solving sequential decision-making problems.Despite its reputation,data inefficiency caused by its trial and error learning mechanism makes deep rein-forcement learning difficult to apply in a wide range of areas.Many methods have been developed for sample efficient deep reinforce-ment learning,such as environment modelling,experience transfer,and distributed modifications,among which distributed deep rein-forcement learning has shown its potential in various applications,such as human-computer gaming and intelligent transportation.In this paper,we conclude the state of this exciting field,by comparing the classical distributed deep reinforcement learning methods and studying important components to achieve efficient distributed learning,covering single player single agent distributed deep reinforce-ment learning to the most complex multiple players multiple agents distributed deep reinforcement learning.Furthermore,we review re-cently released toolboxes that help to realize distributed deep reinforcement learning without many modifications of their non-distrib-uted versions.By analysing their strengths and weaknesses,a multi-player multi-agent distributed deep reinforcement learning toolbox is developed and released,which is further validated on Wargame,a complex environment,showing the usability of the proposed tool-box for multiple players and multiple agents distributed deep reinforcement learning under complex games.Finally,we try to point out challenges and future trends,hoping that this brief review can provide a guide or a spark for researchers who are interested in distrib-uted deep reinforcement learning. 相似文献

11.

Feature Selection and Feature Learning for High-dimensional Batch Reinforcement Learning: A Survey

De-Rong Liu Hong-Liang Li Ding Wang 《国际自动化与计算杂志》2015,(3):229-242

Tremendous amount of data are being generated and saved in many complex engineering and social systems every day.It is significant and feasible to utilize the big data to make better decisions by machine learning techniques. In this paper, we focus on batch reinforcement learning(RL) algorithms for discounted Markov decision processes(MDPs) with large discrete or continuous state spaces, aiming to learn the best possible policy given a fixed amount of training data. The batch RL algorithms with handcrafted feature representations work well for low-dimensional MDPs. However, for many real-world RL tasks which often involve high-dimensional state spaces, it is difficult and even infeasible to use feature engineering methods to design features for value function approximation. To cope with high-dimensional RL problems, the desire to obtain data-driven features has led to a lot of works in incorporating feature selection and feature learning into traditional batch RL algorithms. In this paper, we provide a comprehensive survey on automatic feature selection and unsupervised feature learning for high-dimensional batch RL. Moreover, we present recent theoretical developments on applying statistical learning to establish finite-sample error bounds for batch RL algorithms based on weighted Lpnorms. Finally, we derive some future directions in the research of RL algorithms, theories and applications. 相似文献

12.

Robotics Applications in Manufacturing

Mohsen Attaran 《Information Systems Management》1990,7(1):14-21

Changes in national and international competition, coupled with the mass application of new technologies, are revolutionizing production. Consumers are demanding quality, and manufacturers are being forced to provide it. Companies that wish to remain competitive and maintain their market share are turning to such advanced technologies as computer-aided industrial robots. This article discusses the benefits and justification of the industrial robot and provides managers in manufacturing firms with an overview of its potential applications. 相似文献

13.

群机器人通信研究综述

王亚超薛颂东曾建潮《工业控制计算机》2011,24(9):64-66

为了完成规定任务,群机器人中的个体机器人之间须进行协调合作,这有赖于机器人在有限感知和局部交互基础上的群体智能行为涌现,故对群机器人系统中的通信研究现状进行述评。首先对通信机制进行了划分,指出隐式通信的效率较显式通信为低,且作为人工系统的群机器人中更多采用显式通信机制。针对显式通信,通过对通信方式、网络拓扑结构、通信协议及通信语言等主要内容进行阐述,明确群机器人中通信研究的现状,并对该领域的研究方向进行了展望。相似文献

14.

Editorial Note: Big Multimedia Data in Robotics Applications

《Multimedia Tools and Applications》2018,77(9):10391-10391

相似文献

15.

Reinforcement Learning: An Introduction

《Neural Networks, IEEE Transactions on》2005,16(1):285-286

相似文献

16.

视频传输技术在遥操作机器人中的应用 总被引：2，自引：0，他引：2

蒋毅彭刚黄心汉《现代计算机》2002,(5):21-24,35

本文介绍了视频传输技术在基于网络的操作机器人中的应用，讨论了Winsock网络通信、视频压缩、视频传输及其控制技术。实验表明：本文所提出的方法能有效解决遥操作机器人系统对视频监控的要求。相似文献

17.

Sapphire: Statistical Characterization and Model-Based Adaptation of Networked Applications

《Parallel and Distributed Systems, IEEE Transactions on》2006,17(12):1512-1525

Many modern networked applications require specific levels of service quality from the underlying network. Moreover, next-generation networked applications are expected to adapt to changes in the underlying network, services, and user interactions. While some applications have built-in adaptivity, the adaptation itself requires specification of a system model. This paper presents Sapphire, an experimental approach for systematic model generation for application adaptation within a target network. It employs a nearly-automated, statistical design of experiments to characterize the relationships of both application and network-level parameters. First, it applies the Analysis of Variance (ANOVA) method to identify the most significant parameters and their interactions that affect performance. Next, it generates a model of application performance with respect to these parameters within the ranges of measurements. The key benefit of the framework is the integration of several well-established concepts of statistical modeling and distributed systems in the form of simple APIs so that existing applications can take advantage of it. We demonstrate the usefulness and flexibility of Sapphire by generating a performance model of an audio streaming application. We show that many existing multimedia and QoS-sensitive applications can exploit a statistical modeling approach such as Sapphire to incorporate application adaptivity. The approach can also be used for feedback control of distributed applications, tuning network and application parameters to achieve service levels in a target network. 相似文献

18.

Real-World Robotics: Learning to Plan for Robust Execution

Bennett Scott W. DeJong Gerald F. 《Machine Learning》1996,23(2-3):121-161

In executing classical plans in the real world, small discrepancies between a planner's internal representations and the real world are unavoidable. These can conspire to cause real-world failures even though the planner is sound and, therefore, proves that a sequence of actions achieves the goal. Permissive planning, a machine learning extension to classical planning, is one response to this difficulty. This paper describes the permissive planning approach and presents GRASPER, a permissive planning robotic system that learns to robustly pick up novel objects. 相似文献

19.

单agent强化学习与多agent强化学习比较研究

吴元斌《电脑与信息技术》2009,17(1):8-11

学习、交互及其结合是建立健壮、自治agent的关键必需能力。强化学习是agent学习的重要部分,agent强化学习包括单agent强化学习和多agent强化学习。文章对单agent强化学习与多agent强化学习进行了比较研究,从基本概念、环境框架、学习目标、学习算法等方面进行了对比分析,指出了它们的区别和联系,并讨论了它们所面临的一些开放性的问题。相似文献

20.

Quantum Reinforcement Learning

《IEEE transactions on systems, man, and cybernetics. Part B, Cybernetics》2008,38(5):1207-1220

The key approaches for machine learning, particularly learning in unknown probabilistic environments, are new representations and computation mechanisms. In this paper, a novel quantum reinforcement learning (QRL) method is proposed by combining quantum theory and reinforcement learning (RL). Inspired by the state superposition principle and quantum parallelism, a framework of a value-updating algorithm is introduced. The state (action) in traditional RL is identified as the eigen state (eigen action) in QRL. The state (action) set can be represented with a quantum superposition state, and the eigen state (eigen action) can be obtained by randomly observing the simulated quantum state according to the collapse postulate of quantum measurement. The probability of the eigen action is determined by the probability amplitude, which is updated in parallel according to rewards. Some related characteristics of QRL such as convergence, optimality, and balancing between exploration and exploitation are also analyzed, which shows that this approach makes a good tradeoff between exploration and exploitation using the probability amplitude and can speedup learning through the quantum parallelism. To evaluate the performance and practicability of QRL, several simulated experiments are given, and the results demonstrate the effectiveness and superiority of the QRL algorithm for some complex problems. This paper is also an effective exploration on the application of quantum computation to artificial intelligence. 相似文献