期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

阮晓钢蔡建羡陈静《计算机测量与控制》2009,17(2):321-323

两轮机器人是一个典型的不稳定,非线性,强耦合的自平衡系统,在两轮机器人系统模型未知和没有先验经验的条件下,将强化学习算法和模糊神经网络有效结合,保证了函数逼近的快速性和收敛性,成功地实现两轮机器人的自学习平衡控制,并解决了两轮机器人连续状态空间和动作空间的强化学习问题;仿真和实验表明:该方法不仅在很短的时间内成功地完成对两轮机器人的平衡控制,而且在两轮机器人参数变化较大时,仍能维持两轮机器人的平衡。相似文献

2.

强化学习在足球机器人基本动作学习中的应用 总被引：1，自引：0，他引：1

段勇杨淮清崔宝侠徐心和《机器人》2008,30(5):1

主要研究了强化学习算法及其在机器人足球比赛技术动作学习问题中的应用．强化学习的状态空间和动作空间过大或变量连续,往往导致学习的速度过慢甚至难于收敛．针对这一问题,提出了基于T-S 模型模糊神经网络的强化学习方法,能够有效地实现强化学习状态空间到动作空间的映射．此外,使用提出的强化学习方法设计了足球机器人的技术动作,研究了在不需要专家知识和环境模型情况下机器人的行为学习问题．最后,通过实验证明了所研究方法的有效性,其能够满足机器人足球比赛的需要．相似文献

3.

漂浮基双臂空间机器人系统的模糊神经网络自学习控制 总被引：7，自引：0，他引：7

洪昭斌陈力《机器人》2008,30(5):1

讨论了载体位置、姿态均不受控制的情况下自由漂浮双臂空间机器人系统的高斯基模糊神经网络自学习控制问题．此类空间机器人系统严格遵守动量守恒和角动量守恒,所以其动力学方程表现出强烈的非线性性质．将神经网络与模糊控制相结合,即利用神经网络进行模糊推理, 可使模糊控制具有自学习能力．在此基础上, 设计了双臂空间机器人系统关节空间的高斯基模糊神经网络自学习控制方案．系统的数值仿真证实了该方法的有效性．相似文献

4.

模糊CMAC的柔性空间机器人轨迹跟踪自学习控制

张文辉周启航齐乃明《智能系统学报》2012,(5):457-461

针对不确定自由漂浮柔性空间机器人系统,采用模糊CMAC神经网络自学习控制策略来解决轨迹跟踪控制问题.首先建立漂浮基空间机器人的动力学方程,然后利用具有快速学习能力的模糊CMAC神经网络来逼近非线性柔性臂的逆动力学模型.网络参数采用改进的有监督的Hebb学习规则进行自适应在线调整,并通过关联搜索进行自学习和自组织,其误差代价函数由PID控制器提供.仿真结果表明,这种模糊CMAC逆模PID控制器能够达到较高的控制精度,具有一定的工程应用价值. 相似文献

5.

自适应模糊RBF神经网络的多智能体机器人强化学习 总被引：3，自引：0，他引：3

张文志李智军吕恬生罗青《计算机工程与应用》2003,39(32):111-115

多机器人环境中的学习,由于机器人所处的环境是连续状态,连续动作,而且包含多个机器人,因此学习空间巨大,直接应用Q学习算法难以获得满意的结果。文章研究中针对多智能体机器人系统的学习问题,提出自适应模糊RBF神经网络强化学习算法,网络本身具有模糊推理能力、较强的函数逼近能力以及泛化能力,因此,实现了人类专家知识与机器学习方法的结合,减少学习问题的复杂度;实现连续状态空间与动作空间的策略学习。相似文献

6.

进化强化学习及其在机器人路径跟踪中的应用 总被引：3，自引：1，他引：2

段勇崔宝侠徐心和《控制与决策》2009,24(4)

研究了一种基于自适应启发评价(AHC)强化学习的移动机器人路径跟踪控制方法.AHC的评价单元(ACE)采用多层前向神经网络来实现.将TD(λ)算法和梯度下降法相结合来更新神经网络的权值.AHC的动作选择单元(ASE)由遗传算法优化的模糊推理系统(FIS)构成.ACE网络的输出构成二次强化信号,用于指导ASE的学习.最后将所提出的算法应用于移动机器人的行为学习,较好地解决了机器人的复杂路径跟踪问题. 相似文献

7.

一种模糊强化学习算法及其在RoboCup中的应用 总被引：1，自引：0，他引：1

高建清王浩于磊方宝富《计算机工程与应用》2006,42(6):52-54

传统的强化学习算法只能解决离散状态空间和动作空间的学习问题。论文提出一种模糊强化学习算法,通过模糊推理系统将连续的状态空间映射到连续的动作空间,然后通过学习得到一个完整的规则库。这个规则库为Agent的行为选择提供了先验知识,通过这个规则库可以实现动态规划。作者在RoboCup环境中验证了这个算法,实现了踢球策略的优化。相似文献

8.

多机器人动态编队的强化学习算法研究 总被引：8，自引：0，他引：8

王醒策张汝波顾国昌《计算机研究与发展》2003,40(10):1444-1450

在人工智能领域中，强化学习理论由于其自学习性和自适应性的优点而得到了广泛关注．随着分布式人工智能中多智能体理论的不断发展，分布式强化学习算法逐渐成为研究的重点．首先介绍了强化学习的研究状况，然后以多机器人动态编队为研究模型，阐述应用分布式强化学习实现多机器人行为控制的方法．应用SOM神经网络对状态空间进行自主划分，以加快学习速度；应用BP神经网络实现强化学习，以增强系统的泛化能力；并且采用内、外两个强化信号兼顾机器人的个体利益及整体利益．为了明确控制任务，系统使用黑板通信方式进行分层控制．最后由仿真实验证明该方法的有效性．相似文献

9.

基于Q-学习的模糊神经网络控制器

陈忠泽林良明颜国正《计算机工程与应用》2002,38(19):93-96

神经模糊系统在机器人的智能控制中具有巨大的应用潜力,但已有的系统构造方法几乎都面临着样本资源匮乏这一巨大困难。为克服传统系统构造方法可能因样本获取困难而引起的“维数灾难”等问题,该文在模糊神经网络中引入了Q-学习机制,提出了一种基于Q-学习的模糊神经网络模型,从而赋予神经模糊系统自学习能力。文章最后给出了其在菅野模糊小车控制中的仿真结果。实验表明,在神经模糊系统中融入智能学习机制Q-学习是行之有效的;它可以被用来实现机器人智能行为的自学习。值得一提的是,该文的仿真实验在真实系统上同样是容易实现的,只要系统能提供作为评价信号的传感信息即可。相似文献

10.

动作预测在多机器人强化学习协作中的应用

下载免费PDF全文

曹洁朱宁宁《计算机工程与应用》2013,49(8):257-260

在多机器人系统中,协作环境探索的强化学习的空间规模是机器人个数的指数函数,学习空间非常庞大造成收敛速度极慢。为了解决这个问题,将基于动作预测的强化学习方法及动作选择策略应用于多机器人协作研究中,通过预测机器人可能执行动作的概率以加快学习算法的收敛速度。实验结果表明,基于动作预测的强化学习方法能够比原始算法更快速地获取多机器人的协作策略。相似文献

11.

Application of reinforcement learning in robot soccer

《Engineering Applications of Artificial Intelligence》2007,20(7):936-950

The robot soccer game has been proposed as a benchmark problem for the artificial intelligence and robotic researches. Decision-making system is the most important part of the robot soccer system. As the environment is dynamic and complex, one of the reinforcement learning (RL) method named FNN-RL is employed in learning the decision-making strategy. The FNN-RL system consists of the fuzzy neural network (FNN) and RL. RL is used for structure identification and parameters tuning of FNN. On the other hand, the curse of dimensionality problem of RL can be solved by the function approximation characteristics of FNN. Furthermore, the residual algorithm is used to calculate the gradient of the FNN-RL method in order to guarantee the convergence and rapidity of learning. The complex decision-making task is divided into multiple learning subtasks that include dynamic role assignment, action selection, and action implementation. They constitute a hierarchical learning system. We apply the proposed FNN-RL method to the soccer agents who attempt to learn each subtask at the various layers. The effectiveness of the proposed method is demonstrated by the simulation and the real experiments. 相似文献

12.

Supervised fuzzy reinforcement learning for robot navigation

《Applied Soft Computing》2016

This paper addresses a new method for combination of supervised learning and reinforcement learning (RL). Applying supervised learning in robot navigation encounters serious challenges such as inconsistent and noisy data, difficulty for gathering training data, and high error in training data. RL capabilities such as training only by one evaluation scalar signal, and high degree of exploration have encouraged researchers to use RL in robot navigation problem. However, RL algorithms are time consuming as well as suffer from high failure rate in the training phase. Here, we propose Supervised Fuzzy Sarsa Learning (SFSL) as a novel idea for utilizing advantages of both supervised and reinforcement learning algorithms. A zero order Takagi–Sugeno fuzzy controller with some candidate actions for each rule is considered as the main module of robot's controller. The aim of training is to find the best action for each fuzzy rule. In the first step, a human supervisor drives an E-puck robot within the environment and the training data are gathered. In the second step as a hard tuning, the training data are used for initializing the value (worth) of each candidate action in the fuzzy rules. Afterwards, the fuzzy Sarsa learning module, as a critic-only based fuzzy reinforcement learner, fine tunes the parameters of conclusion parts of the fuzzy controller online. The proposed algorithm is used for driving E-puck robot in the environment with obstacles. The experiment results show that the proposed approach decreases the learning time and the number of failures; also it improves the quality of the robot's motion in the testing environments. 相似文献

13.

强化学习在移动机器人自主导航中的应用 总被引：1，自引：1，他引：1

下载免费PDF全文

秦政丁福光边信黔《计算机工程与应用》2007,43(18):215-217

概述了移动机器人常用的自主导航算法及其优缺点,在此基础上提出了强化学习方法。描述了强化学习算法的原理,并实现了用神经网络解决泛化问题。设计了基于障碍物探测传感器信息的机器人自主导航强化学习方法,给出了学习算法中各要素的数学模型。经仿真验证,算法正确有效,具有良好的收敛性和泛化能力。相似文献

14.

Real-world reinforcement learning for autonomous humanoid robot docking

Nicolás Navarro-Guerrero Cornelius Weber Pascal Schroeter Stefan Wermter 《Robotics and Autonomous Systems》2012,60(11):1400-1407

Reinforcement learning (RL) is a biologically supported learning paradigm, which allows an agent to learn through experience acquired by interaction with its environment. Its potential to learn complex action sequences has been proven for a variety of problems, such as navigation tasks. However, the interactive randomized exploration of the state space, common in reinforcement learning, makes it difficult to be used in real-world scenarios. In this work we describe a novel real-world reinforcement learning method. It uses a supervised reinforcement learning approach combined with Gaussian distributed state activation. We successfully tested this method in two real scenarios of humanoid robot navigation: first, backward movements for docking at a charging station and second, forward movements to prepare grasping. Our approach reduces the required learning steps by more than an order of magnitude, and it is robust and easy to be integrated into conventional RL techniques. 相似文献

15.

Developing reinforcement learning for adaptive co-construction of continuous high-dimensional state and action spaces

Masato Nagayoshi Hajime Murao Hisashi Tamaki 《Artificial Life and Robotics》2012,17(2):204-210

Engineers and researchers are paying more attention to reinforcement learning (RL) as a key technique for realizing adaptive and autonomous decentralized systems. In general, however, it is not easy to put RL into practical use. Our approach mainly deals with the problem of designing state and action spaces. Previously, an adaptive state space construction method which is called a ??state space filter?? and an adaptive action space construction method which is called ??switching RL??, have been proposed after the other space has been fixed. Then, we have reconstituted these two construction methods as one method by treating the former method and the latter method as a combined method for mimicking an infant??s perceptual and motor developments and we have proposed a method which is based on introducing and referring to ??entropy??. In this paper, a computational experiment was conducted using a so-called ??robot navigation problem?? with three-dimensional continuous state space and two-dimensional continuous action space which is more complicated than a so-called ??path planning problem??. As a result, the validity of the proposed method has been confirmed. 相似文献

16.

SSPQL: Stochastic shortest path-based Q-learning

Woo Young Kwon Il Hong Suh Sanghoon Lee 《International Journal of Control, Automation and Systems》2011,9(2):328-338

Reinforcement learning (RL) has been widely used as a mechanism for autonomous robots to learn state-action pairs by interacting with their environment. However, most RL methods usually suffer from slow convergence when deriving an optimum policy in practical applications. To solve this problem, a stochastic shortest path-based Q-learning (SSPQL) is proposed, combining a stochastic shortest path-finding method with Q-learning, a well-known model-free RL method. The rationale is, if a robot has an internal state-transition model which is incrementally learnt, then the robot can infer the local optimum policy by using a stochastic shortest path-finding method. By increasing state-action pair values comprising of these local optimum policies, a robot can then reach a goal quickly and as a result, this process can enhance convergence speed. To demonstrate the validity of this proposed learning approach, several experimental results are presented in this paper. 相似文献

17.

Dyna-Q-based vector direction for path planning problem of autonomous mobile robots in unknown environments

Hoang Huu Viet Sang Hyeok An 《Advanced Robotics》2013,27(3):159-173

Reinforcement learning (RL) is a popular method for solving the path planning problem of autonomous mobile robots in unknown environments. However, the primary difficulty faced by learning robots using the RL method is that they learn too slowly in obstacle-dense environments. To more efficiently solve the path planning problem of autonomous mobile robots in such environments, this paper presents a novel approach in which the robot’s learning process is divided into two phases. The first one is to accelerate the learning process for obtaining an optimal policy by developing the well-known Dyna-Q algorithm that trains the robot in learning actions for avoiding obstacles when following the vector direction. In this phase, the robot’s position is represented as a uniform grid. At each time step, the robot performs an action to move to one of its eight adjacent cells, so the path obtained from the optimal policy may be longer than the true shortest path. The second one is to train the robot in learning a collision-free smooth path for decreasing the number of the heading changes of the robot. The simulation results show that the proposed approach is efficient for the path planning problem of autonomous mobile robots in unknown environments with dense obstacles. 相似文献

18.

Adaptation technique for integrating genetic programming and reinforcement learning for real robots

Kamio S. Iba H. 《Evolutionary Computation, IEEE Transactions on》2005,9(3):318-333

We propose an integrated technique of genetic programming (GP) and reinforcement learning (RL) to enable a real robot to adapt its actions to a real environment. Our technique does not require a precise simulator because learning is achieved through the real robot. In addition, our technique makes it possible for real robots to learn effective actions. Based on this proposed technique, we acquire common programs, using GP, which are applicable to various types of robots. Through this acquired program, we execute RL in a real robot. With our method, the robot can adapt to its own operational characteristics and learn effective actions. In this paper, we show experimental results from two different robots: a four-legged robot "AIBO" and a humanoid robot "HOAP-1." We present results showing that both effectively solved the box-moving task; the end result demonstrates that our proposed technique performs better than the traditional Q-learning method. 相似文献

19.

A topological reinforcement learning agent for navigation

Arthur P. S. Braga Aluízio F. R. Araújo 《Neural computing & applications》2003,12(3-4):220-236

This article proposes a reinforcement learning procedure for mobile robot navigation using a latent-like learning schema. Latent learning refers to learning that occurs in the absence of reinforcement signals and is not apparent until reinforcement is introduced. This concept considers that part of a task can be learned before the agent receives any indication of how to perform such a task. In the proposed topological reinforcement learning agent (TRLA), a topological map is used to perform the latent learning. The propagation of the reinforcement signal throughout the topological neighborhoods of the map permits the estimation of a value function which takes in average less trials and with less updatings per trial than six of the main temporal difference reinforcement learning algorithms: Q-learning, SARSA, Q(λ)-learning, SARSA(λ), Dyna-Q and fast Q(λ)-learning. The RL agents were tested in four different environments designed to consider a growing level of complexity in accomplishing navigation tasks. The tests suggested that the TRLA chooses shorter trajectories (in the number of steps) and/or requires less value function updatings in each trial than the other six reinforcement learning (RL) algorithms. 相似文献