Similar literature
 20 similar articles found
1.
This paper deals with a new approach based on Q-learning for solving the problem of mobile robot path planning in complex unknown static environments. As a computational approach to learning through interaction with the environment, reinforcement learning algorithms have been widely used for intelligent robot control, especially in the field of autonomous mobile robots. However, the learning process is slow and cumbersome. For practical applications, rapid rates of convergence are required. Aiming at the problem of slow convergence and long learning time for Q-learning based mobile robot path planning, a state-chain sequential feedback Q-learning algorithm is proposed for quickly searching for the optimal path of mobile robots in complex unknown static environments. The state chain is built during the searching process. After one action is chosen and the reward is received, the Q-values of the state-action pairs on the previously built state chain are sequentially updated with one-step Q-learning. As more Q-values are updated after each action, the number of actual steps needed for convergence decreases, and thus the learning time decreases, where a step is a state transition. Extensive simulations validate the efficiency of the newly proposed approach for mobile robot path planning in complex environments. The results show that the new approach has a high convergence speed and that the robot can find the collision-free optimal path in complex unknown static environments in much less time than with the one-step Q-learning algorithm and the Q(λ)-learning algorithm.
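
The state-chain idea in the abstract above can be illustrated with a minimal sketch. This is one possible reading of the abstract, not the authors' implementation: the function names, the reverse (newest-first) update order, and the values of `alpha` and `gamma` are all assumptions made for illustration.

```python
from collections import defaultdict

def q_update(Q, s, a, r, s_next, actions, alpha=0.5, gamma=0.9):
    """Standard one-step Q-learning update for a single transition."""
    best_next = max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])

def chain_feedback_step(Q, chain, transition, actions, alpha=0.5, gamma=0.9):
    """Append the new transition to the state chain, then re-apply the
    one-step update to every stored transition, newest first (assumed order),
    so that a fresh reward propagates back along the chain at once."""
    chain.append(transition)
    for s, a, r, s_next in reversed(chain):
        q_update(Q, s, a, r, s_next, actions, alpha, gamma)

# Hypothetical corridor 0 -> 1 -> 2 -> 3, with reward 1 on reaching state 3.
actions = ["right"]
Q = defaultdict(float)  # unseen state-action pairs default to 0.0
chain = []
for s in range(3):
    reward = 1.0 if s + 1 == 3 else 0.0
    chain_feedback_step(Q, chain, (s, "right", reward, s + 1), actions)

print(Q[(0, "right")])  # non-zero after a single pass along the corridor
```

With plain one-step Q-learning, only the last state-action pair would receive a non-zero value after the first pass; here every pair on the chain does, which is the mechanism the abstract credits for faster convergence.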

2.
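To address target tracking for a monocular-vision mobile robot in unknown environments, a method for simultaneous robot localization, mapping, and target tracking is proposed that combines target state estimation with robot observability control. For state estimation, a target tracking algorithm is designed within an extended Kalman filter framework, building on monocular-vision simultaneous localization and mapping. For observability control, an optimal control method is designed that maximizes the update of the target covariance matrix. The method enables the robot, using monocular vision alone, to estimate its own state, the environment state, and the target state simultaneously while following the target. Simulations and experiments on a prototype verify the coupling between target state estimation and robot control and demonstrate the accuracy and effectiveness of the method. The results show that the robot produces a spiral maneuvering trajectory, and that target tracking accuracy and robot localization accuracy are proportional to the robot's maneuverability.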

3.
《Advanced Robotics》2012,26(17):2021-2041
Abstract

The calibration parameters of a mobile robot play a substantial role in navigation tasks. Often these parameters are subject to variations that depend either on changes in the environment or on the load of the robot. In this paper, we propose an approach to simultaneously estimate a map of the environment, the position of the on-board sensors of the robot, and its kinematic parameters. Our method requires no prior knowledge about the environment and relies only on a rough initial guess of the parameters of the platform. The proposed approach estimates the parameters online and it is able to adapt to non-stationary changes of the configuration. We tested our approach in simulated environments and on a wide range of real-world data using different types of robotic platforms.

4.
Exploration of unknown environments using autonomous mobile robots is essential in various scenarios such as search and rescue missions following natural disasters. The task consists essentially in traversing the environment to build a complete and accurate map of it, and different applications may demand different exploration strategies. In the literature, the most used strategy is a simple greedy approach which visits the closest unknown sites first, without considering whether they are likely to yield significant information gain about the environment. In this paper, we propose a navigation strategy for efficient exploration of unknown environments that, based on local structures present in the map built so far, uses Shannon entropy to estimate the expected information gain of exploring each candidate frontier. A key advantage of our method over the state of the art is that it allows the robot to simultaneously (i) select a destination likely to be most informative among all candidate frontiers; and (ii) compute its own path to that destination. This unified approach balances priority between candidate frontiers with high expected information gain and those closer to the current position of the robot. We thoroughly evaluate our methodology in several experiments in a simulated environment, showing that our approach provides faster information gain about the environment when compared to other exploration strategies.
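
As a rough illustration of entropy-based frontier scoring, the sketch below computes the Shannon entropy of occupancy-grid cells around each frontier and trades it off against distance. It is a generic example, not the method of the paper above: the square window, the linear distance penalty, and the names `local_entropy`, `best_frontier`, and `distance_weight` are all assumptions.

```python
import math

def cell_entropy(p):
    """Shannon entropy (in bits) of a cell with occupancy probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a fully known cell carries no uncertainty
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))

def local_entropy(grid, cell, radius=1):
    """Sum of cell entropies in a square window centred on `cell`."""
    rows, cols = len(grid), len(grid[0])
    r0, c0 = cell
    total = 0.0
    for r in range(max(0, r0 - radius), min(rows, r0 + radius + 1)):
        for c in range(max(0, c0 - radius), min(cols, c0 + radius + 1)):
            total += cell_entropy(grid[r][c])
    return total

def best_frontier(grid, frontiers, robot, radius=1, distance_weight=0.1):
    """Pick the frontier with the best entropy-minus-distance score
    (an assumed scoring rule, chosen only for illustration)."""
    def score(f):
        dist = math.hypot(f[0] - robot[0], f[1] - robot[1])
        return local_entropy(grid, f, radius) - distance_weight * dist
    return max(frontiers, key=score)

# Hypothetical map: the left columns are known free space (p = 0.0),
# the right columns are unexplored (p = 0.5).
grid = [[0.0, 0.0, 0.5, 0.5, 0.5] for _ in range(3)]
print(best_frontier(grid, [(1, 1), (1, 3)], robot=(1, 0)))
```

Raising `distance_weight` shifts the choice toward nearby frontiers, which mirrors the balance between information gain and proximity described in the abstract.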

5.
Learning in the mobile robot domain is a very challenging task, especially in non-stationary conditions. The behavior-based approach has proven to be useful in making mobile robots work in real-world situations. Since the behaviors are responsible for managing the interactions between the robot and its environment, observing their use can be exploited to model these interactions. In our approach, the robot is initially given a set of behavior-producing modules to choose from, and the algorithm provides a memory-based approach to dynamically adapt the selection of these behaviors according to the history of their use. The approach is validated using a vision- and sonar-based Pioneer I robot in non-stationary conditions, in the context of a multi-robot foraging task. Results show the effectiveness of the approach in taking advantage of any regularities experienced in the world, leading to fast and adaptable specialization for the learning robot.

6.
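To enable a mobile robot to avoid obstacles efficiently and in a socially friendly way in densely crowded, complex environments, this paper proposes an obstacle avoidance algorithm for mobile robots in crowds based on deep reinforcement learning. First, to address the limited learning capacity of the value function network in deep reinforcement learning algorithms, the value function network is improved on the basis of crowd interaction: an angle pedestrian grid extracts the interaction information among pedestrians, and an attention mechanism extracts the temporal features of each individual pedestrian, learning the relative importance of the current state and the historical trajectory states as well as their joint influence on the robot's obstacle avoidance policy. This provides prior knowledge for the subsequent multilayer perceptron. Second, the reinforcement learning reward function is designed according to human spatial behavior, and states in which the robot's heading changes too sharply are penalized, which meets the requirement of comfortable obstacle avoidance. Finally, simulation experiments verify the feasibility and effectiveness of the algorithm in densely crowded, complex environments.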

7.
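Current wall-following algorithms for indoor mobile robots tend to be overly complex, retrace paths, fail to achieve complete coverage, and run inefficiently. To address these problems, an efficient wall-following traversal method that incorporates historical states is studied for unknown indoor environments. The algorithm builds a motion rule base from three kinds of information: the robot's environment-motion state in the previous cycle (8 classes), its current environment-motion state (8 classes), and the turning direction (2 classes). While following a wall, the robot continuously collects these three kinds of information to decide its current direction of motion, and repeats this cycle until the specified wall-following task is complete. Finally, the algorithm is evaluated in simulation and in real-world experiments. The results show that it completes wall-following tasks efficiently and quickly in varied, complex environments and adapts well to unknown indoor environments.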

8.
《Advanced Robotics》2013,27(12-13):1761-1778
Over the last decade, particle filters have been applied with great success to a variety of state estimation problems. The standard particle filter suffers from poor efficiency during the estimation process, especially in the global localization and kidnapped robot problems. In this paper, we propose a novel information entropy-based adaptive approach to improve the efficiency of particle filters by adapting the number of particles. The information entropy-based adaptive particle filter uses the information entropy to represent the uncertainty of a mobile robot about the environment. By continuously obtaining sensor information, the robot gradually reduces its uncertainty about the environment and, therefore, reduces the number of particles needed for the estimation process. We derived the mathematical equation relating the information entropy to the particle number. Extensive localization experiments using a mobile robot showed that our approach yielded drastic improvements in efficiency over a standard particle filter with a fixed number of particles and over other adaptive approaches.
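
To make the idea of tying particle count to entropy concrete, here is a minimal sketch. It does not reproduce the equation derived in the paper above, which the abstract does not state; the histogram-based entropy estimate, the linear mapping, and the bounds `n_min` and `n_max` are all assumptions chosen for illustration.

```python
import math
from collections import Counter

def position_entropy(particles, cell=1.0):
    """Entropy (in bits) of the particle positions, binned into square cells
    of side `cell`. Spread-out particles give high entropy."""
    counts = Counter(
        (math.floor(x / cell), math.floor(y / cell)) for x, y in particles
    )
    n = len(particles)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def adapt_particle_count(entropy, max_entropy, n_min=100, n_max=5000):
    """Map entropy linearly onto [n_min, n_max] (an assumed rule):
    more uncertainty, more particles."""
    if max_entropy <= 0:
        return n_min
    ratio = min(max(entropy / max_entropy, 0.0), 1.0)
    return int(round(n_min + ratio * (n_max - n_min)))

# Hypothetical particle sets: one spread over four cells, one converged.
spread = [(0.5, 0.5), (1.5, 0.5), (0.5, 1.5), (1.5, 1.5)]
converged = [(0.1, 0.1), (0.2, 0.3), (0.4, 0.2), (0.3, 0.4)]
for particles in (spread, converged):
    h = position_entropy(particles)
    print(h, adapt_particle_count(h, max_entropy=2.0))
```

As the filter converges and particles cluster into fewer cells, the entropy drops and so does the requested particle count, which is the qualitative behaviour the abstract describes.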

9.
Behaviour-based models have been widely used to represent mobile robotic systems, which operate in uncertain dynamic environments and combine information from several sensory sources. In such models, complex mobile robotic applications are specified by combining deliberative goal-oriented planning with reactive sensor-driven operations. Consequently, the design of mobile robotic architectures requires the combination of time-constrained activities with deliberative time-consuming components. Furthermore, the temporal requirements of the reactive activities are variable and depend on the environment (e.g. recognition processes) and/or on application parameters (e.g. process frequencies depend on robot speed). In this paper, a real-time mobile robotic architecture to cope with the functional and variable temporal characteristics of behaviour-based mobile robotic applications is proposed. Run-time flexibility is a main feature of the architecture, which supports the variability of the temporal characteristics of the workload. The system has to adapt to the environmental conditions by adjusting robot control parameters (e.g. speed) or the system load (e.g. computational time), and to act accordingly. This adaptation centres on the ability of the system to select the appropriate activity to execute depending on the available time, and to change its behaviour depending on the environmental information. The flexibility of the system is achieved through the definition of a real-time task model and the design of adaptation mechanisms for regulating the reactive load. Improving the robot's quality of service (QoS) is another important aspect discussed in the paper. The architecture incorporates a quality of service manager (QoSM) that dynamically monitors, analyses, and improves the robot's performance. Depending on the internal state, on the environment, and on the objectives, the robot's performance requirements vary (e.g. when the environment is overloaded, global map processes generating high-quality maps are required). The QoSM receives the performance requirements of the robot and, by adjusting the reactive load, frees the slack time needed to schedule the most suitable deliberative processes, thereby fulfilling the robot's QoS. Moreover, the deliberative load can be scheduled by different heuristic strategies that provide answers of varying quality.

10.
This paper deals with the implementation of emotions in mobile robots performing a specified task in a group in order to develop intelligent behavior and easier forms of communication. The overall group performance depends on the individual performance, group communication, and the synchronization of cooperation. With its emotional capability, each robot can recognize changes in the environment, understand a colleague robot’s state, and adapt and react to a changed world. The adaptive behavior of a robot is derived from the dominating emotion in an intelligent manner. In our control architecture, emotion is used to select the control precedence among alternatives such as behavior modes, cooperation plans, and goals. Emotional interaction happens among the robots, and a robot is biased by the emotional state of a colleague robot in performing a task. Here, emotional control is used for a better understanding of the colleague’s internal state, for faster communication, and for better performance by eliminating dead time. This work was presented in part at the 12th International Symposium on Artificial Life and Robotics, Oita, Japan, January 25–27, 2007.

11.
We address the problem of controlling a mobile robot to explore a partially known environment. The robot’s objective is the maximization of the amount of information collected about the environment. We formulate the problem as a partially observable Markov decision process (POMDP) with an information-theoretic objective function, and solve it by applying forward simulation algorithms with an open-loop approximation. We present a new sample-based approximation for mutual information useful in mobile robotics. The approximation can be seamlessly integrated with forward simulation planning algorithms. We investigate the usefulness of POMDP-based planning for exploration and, to alleviate some of its weaknesses, propose a combination with frontier-based exploration. Experimental results in simulated and real environments show that, depending on the environment, applying POMDP-based planning for exploration can improve performance over frontier exploration.

12.
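To meet the need for efficient motion planning of target-tracking robots in complex environments, this paper proposes an expert policy gradient method based on multi-agent reinforcement learning (ML-DDPG). First, a distributed multi-Actor-Critic network architecture is built on minimal task units. Next, a reinforcement learning kinematic model and a visual sample preprocessing mechanism are established for the robot's active obstacle clearing and target tracking tasks, from which an expert-policy-guided method for estimating the optimal target value is derived. Parallel training and centralized experience sharing further improve the training efficiency of the algorithm. Finally, the target tracking and obstacle clearing performance of ML-DDPG is tested in different task environments, and a comparison with other algorithms verifies that it transfers and generalizes well to unfamiliar environments.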

13.
For a mobile robot to be practical, it needs to navigate in dynamically changing environments and manipulate objects in the environment with ease. The main challenges to satisfying these requirements in mobile robot research include the collection of robot environment information, storage and organization of this information, and fast task planning based on available information. Conventional approaches to these problems are far from satisfactory because of their high computation time. In this paper, we specifically address the problems of storage and organization of the environment information and fast task planning in the area of robotic research. We propose a special object-oriented data model (OODM) for information storage and management in order to solve the first problem. This model explicitly represents domain knowledge and abstracts a global perspective about the robot's dynamically changing environment. To solve the second problem, we introduce a fast task planning algorithm that fully uses domain knowledge related to robot applications and to the given environment. Our OODM-based task planning method presents a general framework and representation, into which domain-specific information, domain decomposition methods, and specific path planners can be tailored for different task planning problems. This method unifies and integrates the salient features from various areas such as databases, artificial intelligence, and robot path planning, thus increasing the planning speed significantly.

14.
Mobile service robots are designed to operate in dynamic and populated environments. To plan their missions and to perform them successfully, mobile robots need to keep track of relevant changes in the environment. For example, office delivery or cleaning robots must be able to estimate the state of doors or the position of waste-baskets in order to deal with the dynamics of the environment. In this paper we present a probabilistic technique for estimating the state of dynamic objects in the environment of a mobile robot. Our method matches real sensor measurements against expected measurements obtained by a sensor simulation to efficiently and accurately identify the most likely state of each object even if the robot is in motion. The probabilistic approach allows us to incorporate the robot’s uncertainty in its position into the state estimation process. The method has been implemented and tested on a real robot. We present different examples illustrating the efficiency and robustness of our approach.

15.
The paper deals with supervised robot navigation in known environments. The navigation task is divided into two parts. One part is done by the supervisor system: it sets vector marks on the salient edges of the virtual environment map and guides the robot to reach these marks. The mobile robots have to perform a specific task along the given paths and handle local obstacle avoidance individually. Salient point detection, vector mark estimation, and optimal path calculation are done on the supervisor computer using colored Petri nets. The proposed approach was extended to simulate a flexible manufacturing system consisting of a swarm of 17 robots, 17 warehouses, and 17 manufacturing places. Our experimental investigation showed that simulated mobile robots with the proposed supervision system moved efficiently along the planned paths.

16.
Generating teams of robots that are able to perform their tasks over long periods of time requires the robots to be responsive to continual changes in robot team member capabilities and to changes in the state of the environment and mission. In this article, we describe the L-ALLIANCE architecture, which enables teams of heterogeneous robots to dynamically adapt their actions over time. This architecture, which is an extension of our earlier work on ALLIANCE, is a distributed, behavior-based architecture aimed for use in applications consisting of a collection of independent tasks. The key issue addressed in L-ALLIANCE is the determination of which tasks robots should select to perform during their mission, even when multiple robots with heterogeneous, continually changing capabilities are present on the team. In this approach, robots monitor the performance of their teammates performing common tasks, and evaluate their performance based upon the time of task completion. Robots then use this information throughout the lifetime of their mission to automatically update their control parameters. After describing the L-ALLIANCE architecture, we discuss the results of implementing this approach on a physical team of heterogeneous robots performing proof-of-concept box pushing experiments. The results illustrate the ability of L-ALLIANCE to enable lifelong adaptation of heterogeneous robot teams to continuing changes in the robot team member capabilities and in the environment.

17.
Within mobile robotics, one of the most dominant relationships to consider when implementing robot control code is the one between the robot’s sensors and its motors. When implementing such a relationship, efficiency and reliability are of crucial importance. Both often prove challenging because of the complex interaction between a robot and the environment in which it exists, frequently resulting in a time-consuming iterative process where control code is redeveloped and tested many times before an optimal controller is obtained. In this paper, we address this challenge by implementing an alternative approach to control code generation, which first identifies the desired robot behaviour and represents the sensor-motor task algorithmically through system identification using the NARMAX modelling methodology. The control code is generated by task demonstration, where the sensory perception and velocities are logged and the relationship that exists between them is then modelled using system identification. This approach produces transparent control code through non-linear polynomial equations that can be mathematically analysed to obtain formal statements regarding specific inputs/outputs. We demonstrate this approach to control code generation and analyse its performance in dynamic environments.

18.
In this paper, we present an approach for directing a mobile robot under real-world conditions into a target position by means of pointing poses only. Because one important objective of our work is the development of a low-cost platform, only monocular vision at web-cam level is employed. Our previous approach presented in Gross et al. (2006) [1], Richarz et al. (2007) [2] has been improved by several additional processing steps. A background subtraction technique and a histogram equalization have been integrated in the preprocessing stage to be able to work in environments with structured backgrounds and under variable lighting conditions. Furthermore, a discriminant analysis was used to find the most relevant input features for the pointing pose estimator. The contribution of this paper is, however, not only the presentation of an approach to estimating pointing poses in a demanding real-world scenario on a mobile robot, but also the detailed and evaluative comparison between different image-preprocessing techniques, alternative feature extraction methods, and several function approximators with the same set of test and training data. Reasonable combinations of the different methods are tested, and for each component of the processing chain the effect on the accuracy of the target estimation is quantified. The approach presented in this paper has been implemented on the mobile interaction robot Horos to determine the performance and estimation accuracy under real-world conditions. Furthermore, we compared the accuracy of our approach with that of humans performing the same estimation task, and achieved very comparable results for the best estimator.

19.
In this article, we propose a localization scheme for a mobile robot based on the distance between the robot and moving objects. This method combines the distance data obtained from ultrasonic sensors on a mobile robot, and estimates the location of the mobile robot and the moving object. The movement of the object is detected by a combination of data and the object’s estimated position. Then, the mobile robot’s location is derived from the a priori known initial state. We use kinematic modeling that represents the movement of a robot and an object. A Kalman-filtering algorithm is used for addressing estimation error and measurement noise. The performance is verified through computer simulation experiments. Finally, the results of experiments are presented and discussed. The proposed approach allows a mobile robot to seek its own position in a weakly structured environment. This work was presented in part at the 12th International Symposium on Artificial Life and Robotics, Oita, Japan, January 25–27, 2007.
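
The role of Kalman filtering in smoothing noisy range readings can be shown with a minimal scalar sketch. It is a textbook one-dimensional filter with a static-state model, not the kinematic model of the article above; the function name `kalman_step`, the noise variances `q` and `r`, and the sample readings are hypothetical.

```python
def kalman_step(x, p, z, q=0.01, r=0.25):
    """One predict/update cycle of a scalar Kalman filter.

    x, p : prior state estimate (e.g. a distance) and its variance
    z    : new measurement (e.g. an ultrasonic range reading)
    q, r : assumed process and measurement noise variances
    """
    p_pred = p + q                 # predict: state assumed static, variance grows
    k = p_pred / (p_pred + r)      # Kalman gain
    x_new = x + k * (z - x)        # correct the estimate with the innovation
    p_new = (1.0 - k) * p_pred     # updated variance
    return x_new, p_new

# Hypothetical noisy readings of a distance close to 2.0 m.
x, p = 0.0, 1.0
for z in [2.1, 1.9, 2.05, 1.95, 2.0]:
    x, p = kalman_step(x, p, z)
print(x, p)
```

Each measurement pulls the estimate toward the reading by an amount set by the gain, and the variance shrinks as evidence accumulates. A full implementation of the scheme in the abstract would use a vector state covering both the robot and the moving object.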

20.
In human-robot collaboration in manufacturing, the operator's safety is the primary issue during manufacturing operations. This paper presents a deep reinforcement learning approach to realize real-time collision-free motion planning of an industrial robot for human-robot collaboration. Firstly, the safe human-robot collaboration manufacturing problem is formulated as a Markov decision process, and the mathematical expression of the reward function design problem is given. The goal is that the robot can autonomously learn a policy to reduce the accumulated risk and assure the task completion time during human-robot collaboration. To transform our optimization objective into a reward function that guides the robot to learn the expected behaviour, a reward function optimizing approach based on the deterministic policy gradient is proposed to learn a parameterized intrinsic reward function. The reward function for the agent to learn the policy is the sum of the intrinsic reward function and the extrinsic reward function. Then, a deep reinforcement learning algorithm, intrinsic reward-deep deterministic policy gradient (IRDDPG), which is the combination of the DDPG algorithm and the reward function optimizing approach, is proposed to learn the expected collision avoidance policy. Finally, the proposed algorithm is tested in a simulation environment, and the results show that the industrial robot can learn the expected policy to achieve safety assurance for industrial human-robot collaboration without missing the original target. Moreover, the reward function optimizing approach can help compensate for shortcomings in the designed reward function and improve policy performance.
