Similar literature
 20 similar documents found (search time: 171 ms)
1.
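Traffic signal control systems are dispersed, in both physical location and control logic, across a dynamically changing network traffic environment. Treating the signal controller at each intersection as a heterogeneous agent makes the problem well suited to modeling and description with "model-free, self-learning, data-driven" multi-agent reinforcement learning (MARL). To understand the state of research, open problems, and prospects of this approach, the paper systematically surveys concrete applications of MARL to traffic control in China and abroad, covering the conceptual model of MARL traffic signal control, fully isolated MARL control, MARL with partial state cooperation, and MARL control with coordinated actions. It analyzes their technical characteristics and generational differences, discusses research trends of MARL in traffic signal control, and proposes that the key issues in developing integrated MARL control for network traffic signals are the reinforcement learning control mechanism, coordination among linked controllers, traffic state feature extraction, and integrated multi-mode control.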

2.
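Deep reinforcement learning (DRL) is widely applied to urban traffic signal control, a problem with high uncertainty. Existing DRL methods for traffic signal control, however, use only conventional deep neural networks, whose perception ability is limited in complex traffic scenarios. In addition, although state is one of the three core elements of reinforcement learning, the traffic state in existing methods still requires careful manual design. The paper therefore proposes a DRL traffic signal control algorithm based on an attention mechanism. Introducing attention lets the neural network automatically focus on the important state components, which strengthens its perception ability, improves signal control performance, and makes the state vector easier to design. Experiments on the SUMO (Simulation of Urban Mobility) platform show that, at single and multiple intersections and under low and high traffic flow, the proposed algorithm, using only a simple traffic state, achieves the best performance on metrics such as average waiting time and travel time compared with three baseline signal control algorithms.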
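
The idea of letting a network "focus on the important state components" can be illustrated with a softmax re-weighting of the state vector. This is only a minimal sketch, not the paper's architecture: the function names, the example queue lengths, and the relevance scores are all assumptions made for illustration, and in a trained model a small network would produce the scores.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(state, scores):
    """Scale each state component by its attention weight."""
    weights = softmax(scores)
    weighted = [w * x for w, x in zip(weights, state)]
    return weighted, weights


# Hypothetical state: queue lengths on four approaches of one intersection.
state = [12.0, 3.0, 7.0, 0.0]
# Hypothetical relevance scores (assumed here, learned in a real model).
scores = [2.0, 0.5, 1.0, -1.0]

weighted_state, weights = attend(state, scores)
print(weights)
print(weighted_state)
```

The weights sum to one, so components with higher scores dominate the input passed on to the rest of the network.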

3.
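To address the limited coordination efficiency of conventional distributed adaptive traffic signal control and its curse of dimensionality, the paper builds a model of an urban regional traffic signal control system, formulates its optimization as game-based coordinated control of traffic signals at local intersections, and proposes a learning algorithm based on game interaction over local information among intersection signal control agents. During learning, the agents interact through games over local information and autonomously adjust their signal control policies so that they gradually learn the optimal policy. Across several designed traffic demand scenarios, with a performance index built as a weighted combination of network-wide average delay and average number of stops, game learning achieves better signal control than a genetic algorithm and actuated control: it converges to the optimal value of the performance index and manages traffic demand better.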

4.
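A self-learning control method for single-intersection traffic signals combining Q-learning and fuzzy logic*   Cited by: 1 (self-citations: 0, other citations: 1)
Considering the dynamics and uncertainty of urban traffic systems, the paper proposes a structure for an intelligent control system for signalized intersections based on reinforcement learning and studies dynamic real-time control of a single intersection. A BP neural network is combined with the Q-learning algorithm to achieve online learning at the intersection. Because traffic signal control is evaluated against multiple objectives, the Q-learning reward and penalty signal is designed with fuzzy logic to optimize signal control. Finally, simulation experiments on a typical four-way intersection are run in the Paramics microscopic traffic simulator under three traffic scenarios. The results show that the method maintains high control efficiency under abrupt changes in the different scenarios and clearly outperforms fixed-time control.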

5.
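Because conventional traffic signals cannot effectively relieve current urban congestion, the paper combines Q-learning with traffic signal control to address the problem. It analyzes traffic control theory, studies reinforcement learning theory and the steps of the Q-learning algorithm, and combines traffic signals with Q-learning. Simulation results show that the combined approach outperforms the current fixed-cycle signal control method.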
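
For readers unfamiliar with the Q-learning step that abstracts like this one rely on, the standard tabular update can be sketched as follows. The state labels, the two actions, the reward value, and the learning parameters are assumptions chosen for illustration; the abstract does not specify the paper's own state, action, or reward design.

```python
from collections import defaultdict

# Hypothetical action set for a two-phase signal.
ACTIONS = ("keep_phase", "switch_phase")


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q-value."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Q-table with all entries initialised to zero.
q = defaultdict(float)

# One assumed transition: switching phase while the north-south queue is long
# costs a reward of -4.0 (for example, four vehicles still waiting).
new_value = q_update(
    q,
    state="long_queue_ns",
    action="switch_phase",
    reward=-4.0,
    next_state="short_queue_ns",
)
print(new_value)
```

Starting from an all-zero table, this single update moves the entry one tenth of the way toward the temporal-difference target of -4.0.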

6.
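Introducing intelligent detection and control into traffic signal control systems is now a clear trend. Reinforcement learning and deep reinforcement learning in particular show great technical advantages in scalability, stability, and generalizability, and have become a research focus in this field. The paper studies traffic signal control tasks based on reinforcement learning. Building on a broad survey of research on traffic signal control methods, it systematically reviews the categories and applications of reinforcement learning and deep reinforcement learning in intelligent traffic signal control, summarizes feasible schemes that use multi-agent cooperation to solve large-scale traffic signal control, and gives a classified overview of the traffic scenario factors that affect large-scale control. From the perspective of improving signal controller performance, it identifies the challenges the field currently faces and promising directions for future research.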

7.
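As an important traffic signaling device, the traffic signal controller is clearly effective in increasing road capacity and reducing traffic accidents. With the development of intelligent transportation technology, building traffic signal control nodes that conform to a fieldbus standard has become necessary. Focusing on the design of a traffic signal control node that conforms to the CAN bus standard, the paper discusses the working principle and hardware circuit design of a node controlled by an AT89C51 microcontroller and presents the CAN bus communication protocol and communication flow, so that a monitoring station can effectively control the node's working state through the CAN interface.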

8.
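Zhang Chen, Yu Jian, He Lianghua. 《计算机科学》 (Computer Science), 2016, 43(8): 171-176
Q-learning is widely used in traffic signal control. In regional traffic, conventional Q-learning-based regional signal control methods obtain information about surrounding intersections through communication among agents and make the most favorable decision. The conventional method performs well in most cases. However, because its feedback computation for the congestion level of surrounding intersections is inaccurate, it makes wrong decisions when congestion levels at the surrounding intersections differ widely, which leads to local hotspot congestion. The paper analyzes this problem and, building on the conventional regional signal control method, proposes an improved regional traffic signal control method based on Q-learning and dynamic weights. It introduces the concept of an "intersection weight" and applies it to the feedback computation through a multi-objective combination method; the weights change dynamically with actual traffic at each intersection, which resolves the tendency to fall into local hotspot congestion. Simulations under three different traffic conditions show that the proposed algorithm performs markedly better than the conventional control method under congested conditions.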
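
One way to picture a feedback signal with dynamically changing intersection weights is sketched below. The abstract does not give the paper's formula, so the weighting rule here (each neighbour's weight proportional to its current congestion level) and all names and numbers are assumptions for illustration only.

```python
def dynamic_weights(congestion):
    """Give each neighbouring intersection a weight proportional to its congestion."""
    total = sum(congestion.values())
    if total == 0:
        # No congestion anywhere: fall back to equal weights.
        return {name: 1.0 / len(congestion) for name in congestion}
    return {name: level / total for name, level in congestion.items()}


def weighted_feedback(congestion):
    """Negative weighted sum of neighbour congestion, usable as a feedback signal."""
    weights = dynamic_weights(congestion)
    return -sum(weights[name] * level for name, level in congestion.items())


# Hypothetical congestion levels in [0, 1]; one neighbour is far more congested.
neighbours = {"north": 0.9, "east": 0.1, "south": 0.1, "west": 0.1}

feedback = weighted_feedback(neighbours)
uniform = -sum(neighbours.values()) / len(neighbours)
print(feedback, uniform)
```

With equal weights the single congested neighbour is averaged away, while the congestion-proportional weights produce a noticeably stronger penalty, which is the kind of effect the abstract attributes to dynamic weighting.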

9.
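Real-time optimal control and simulation of traffic signals at an isolated urban intersection   Cited by: 4 (self-citations: 0, other citations: 4)
Based on the arrival characteristics of traffic flow at isolated urban intersections, the paper divides arriving traffic flow into four states and proposes a "state-partition-based real-time control method for multi-phase traffic signals". According to the arrival characteristics and control objectives in each state, the method selects a suitable performance index for each traffic state and builds a dynamic signal timing model for each. An improved adaptive real-coded genetic algorithm is designed to solve the timing models; it handles constraints with a classification-based ranking penalty mechanism and introduces a simulated annealing operator to strengthen the genetic algorithm's local search. Finally, three algorithms are applied to a case study in extensive numerical computation and Paramics simulation; both sets of results show that the designed algorithm achieves high solution accuracy and that the model controls traffic well.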

10.
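Intelligent control of traffic signals is a hot topic in intelligent transportation research. To coordinate traffic adaptively in a more timely and effective way, the paper proposes a traffic signal control model based on distributed deep reinforcement learning. It adopts a deep neural network framework and uses a target network, a double Q network, and a value distribution to improve model performance. High-dimensional real-time traffic information at the intersection is discretized and combined with the waiting time, queue length, delay, and phase information of the corresponding lanes as the state input. With the phase sequence, actions, and reward suitably defined, the control policy is learned online, giving adaptive control by the traffic signal agent. To validate the algorithm, it is compared with three typical deep reinforcement learning algorithms under identical settings in SUMO (Simulation of Urban Mobility). The experiments show that the distributed deep reinforcement learning algorithm controls the traffic signal agent more efficiently and robustly and performs better on average vehicle delay, travel time, queue length, and waiting time at the intersection.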
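
The combination of a target network with a double Q network is commonly realized by the Double DQN target, in which the online network picks the next action and the target network evaluates it. The sketch below shows that standard computation on plain lists; it is not taken from the paper, and the Q-value lists, reward, and discount factor are assumed values.

```python
def double_q_target(reward, gamma, online_next, target_next, done=False):
    """Double DQN target: online network selects the action, target network scores it."""
    if done:
        # No bootstrapping from a terminal state.
        return reward
    best_action = max(range(len(online_next)), key=lambda a: online_next[a])
    return reward + gamma * target_next[best_action]


# Hypothetical Q-values for three signal phases in the next state.
online_next = [1.0, 3.0, 2.0]   # from the online network
target_next = [0.5, 1.5, 4.0]   # from the target network

y = double_q_target(
    reward=-2.0, gamma=0.9, online_next=online_next, target_next=target_next
)
print(y)
```

Here the online network prefers the second phase, so the target uses the target network's value for that phase (1.5) rather than its own maximum (4.0), which is how the double estimator limits overestimation.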

11.
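Wen Kaige, Yang Zhaohui. 《计算机工程》 (Computer Engineering), 2011, 37(17): 152-154
The paper handles intersection signal control with reinforcement learning that uses neural-network value function approximation. Based on the characteristics of traffic flow and intersection signals, it defines the state space, action space, and reward space of the reinforcement learning problem and optimizes the signals with the objective of minimizing vehicle delay at the intersection. A cerebellar model articulation controller (CMAC) neural network approximates the Q-values of reinforcement learning (RL). The proposed RL model is validated on a typical intersection under varying traffic conditions and compared with conventional fixed-time control and fully actuated control. Simulation results show that the RL controller learns well, is stable, and adapts to dynamic changes in traffic flow.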

12.
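Shu Lingzhou, Wu Jia, Wang Chen. 《计算机应用》 (Journal of Computer Applications), 2019, 39(5): 1495-1499
To address how urban traffic signal control can make effective use of relevant information to optimize traffic control while keeping the control algorithm adaptive and robust, the paper proposes a traffic signal control algorithm based on deep reinforcement learning, in which a deep learning network forms an agent that controls traffic for the whole region. First, the agent continuously senses the state of the traffic environment to select the control policy that is likely optimal in the current state. The state is represented abstractly by a position matrix and a speed matrix, a representation that captures the main information in the environment and reduces redundancy. Then, with the goal of maximizing the global speed of vehicles within a limited time, the agent uses a reinforcement learning algorithm to keep correcting its internal parameters according to the effect of the selected policy on the traffic environment. Finally, over many iterations, the agent learns to control traffic effectively. Experiments in the Vissim microscopic traffic simulator show that, compared with other deep reinforcement learning algorithms, the proposed algorithm gives better results in global average speed, average queue length, and stability. Relative to the baseline, average speed increases by 9% and average queue length decreases by about 13.4%. The results indicate that the method can adapt to complex, dynamically changing traffic environments.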

13.
It is known that most of the key problems in visual servo control of robots are related to the performance analysis of the system considering measurement and modeling errors. In this paper, the development and performance evaluation of a novel intelligent visual servo controller for a robot manipulator using neural network Reinforcement Learning is presented. By implementing machine learning techniques into the vision based control scheme, the robot is enabled to improve its performance online and to adapt to the changing conditions in the environment. Two different temporal difference algorithms (Q-learning and SARSA) coupled with neural networks are developed and tested through different visual control scenarios. A database of representative learning samples is employed so as to speed up the convergence of the neural network and real-time learning of robot behavior. Moreover, the visual servoing task is divided into two steps in order to ensure the visibility of the features: in the first step centering behavior of the robot is conducted using neural network Reinforcement Learning controller, while the second step involves switching control between the traditional Image Based Visual Servoing and the neural network Reinforcement Learning for enabling approaching behavior of the manipulator. The correction in robot motion is achieved with the definition of the areas of interest for the image features independently in both control steps. Various simulations are developed in order to present the robustness of the developed system regarding calibration error, modeling error, and image noise. In addition, a comparison with the traditional Image Based Visual Servoing is presented. Real world experiments on a robot manipulator with the low cost vision system demonstrate the effectiveness of the proposed approach.

14.
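In multi-agent reinforcement learning, the state and action spaces grow exponentially with the number of agents, which leads to the curse of dimensionality, slow learning, and poor convergence. The paper proposes a new hybrid reinforcement learning method to improve conventional cooperative multi-agent reinforcement learning. Based on Friend-or-Foe Q-learning, the algorithm first preprocesses the state and action spaces with cluster analysis and runs reinforcement learning only after the dimensionality has been reduced. This avoids repeated work in equivalent states and blind search over the action set and, in theory, greatly improves the agents' learning speed and the algorithm's convergence. The article first outlines the idea behind the improved algorithm and then gives its learning framework and a general description.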

15.
Cooperative, hybrid agent architecture for real-time traffic signal control   Cited by: 1 (self-citations: 0, other citations: 1)
This paper presents a new hybrid, synergistic approach in applying computational intelligence concepts to implement a cooperative, hierarchical, multiagent system for real-time traffic signal control of a complex traffic network. The large-scale traffic signal control problem is divided into various subproblems, and each subproblem is handled by an intelligent agent with a fuzzy neural decision-making module. The decisions made by lower-level agents are mediated by their respective higher-level agents. Through adopting a cooperative distributed problem solving approach, coordinated control by the agents is achieved. In order for the multiagent architecture to adapt itself continuously to the dynamically changing problem domain, a multistage online learning process for each agent is implemented involving reinforcement learning, learning rate and weight adjustment as well as dynamic update of fuzzy relations using an evolutionary algorithm. The test bed used for this research is a section of the Central Business District of Singapore. The performance of the proposed multiagent architecture is evaluated against the set of signal plans used by the current real-time adaptive traffic control system. The multiagent architecture produces significant improvements in the conditions of the traffic network, reducing the total mean delay by 40% and total vehicle stoppage time by 50%.

16.
Reinforcement Learning has proven to be capable of solving complex tasks like playing video games, robotics control, speech or image recognition and processing. Transferring Reinforcement Learning into engineering design helps to overcome two current issues of data-driven Design Automation in engineering design. First, dealing with sparse training data resulting from differing design samples. Second, overcoming the limited number of samples in the training data as consequence of short or insufficient product history. To introduce an alternative approach for Design Automation, this contribution studies feasibility, training effort and transferability of Reinforcement Learning in engineering design. The presented method maps engineering requirements and parametric models into learning environments and provides a novel approach for design automation. In addition to that, the contribution summarises the hyperparameters, which design engineers have to set prior to training, and introduces a novel transfer learning concept for Reinforcement Learning in related design tasks. The support is probed by design tasks of performance-oriented bike parts. Case-independent indicators are presented to estimate the case-specific training effort, the effects of hyperparameter variation and the effects of transferring a pretrained agent to related design tasks. Finally, the findings are used to compare Reinforcement Learning to other data-independent Design Automation approaches to assess potential fields of application for Reinforcement Learning in engineering design.

17.
We present a neural method, based on the Hopfield net, for the modelling and control of over-saturated signalized intersections. The problem is to find, in real time, signal settings that minimize a given traffic criterion such as waiting time. The use of the Hopfield model is justified by its optimization capabilities, especially its fast computation time (arising from its own dynamics), which is of great interest in real-time problems such as traffic control. The original Hopfield algorithm is modified to take into account the constraints specific to the traffic problem. This approach is illustrated by numerical examples of traffic conditions generated by a simulator. We extend the method to urban networks of several interconnected intersections.

18.
Research on a combined model and algorithm for dynamic traffic assignment and signal control   Cited by: 7 (self-citations: 0, other citations: 7)
This paper presents a generalized bi-level programming model of combined dynamic traffic assignment and traffic signal control, and especially analyzes a procedure for determining the equilibrium queuing delays on saturated links for dynamic network signal control satisfying the FIFO (first-in-first-out) rule. The chaotic optimal algorithm proposed in this paper can not only present the optimal signal settings, but also calculate, at each interval, the link inflow rates and outflow rates for the dynamic user optimal problem, and provide real-time information for the travelers. Finally, a numerical example is given to illustrate the application of the proposed model and solution algorithm, and comparison shows that this model has better system performance.

19.
Research on Combined Dynamic Traffic Assignment and Signal Control   Cited by: 2 (self-citations: 0, other citations: 2)
This paper presents a generalized bi-level programming model of combined dynamic traffic assignment and traffic signal control, and especially analyzes a procedure for determining the equilibrium queuing delays on saturated links for dynamic network signal control satisfying the FIFO (first-in-first-out) rule. The chaotic optimal algorithm proposed in this paper can not only present the optimal signal settings, but also calculate, at each interval, the link inflow rates and outflow rates for the dynamic user optimal problem, and provide real-time information for the travelers. Finally, a numerical example is given to illustrate the application of the proposed model and solution algorithm, and comparison shows that this model has better system performance.

20.
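A trustworthy adaptive service composition mechanism   Cited by: 7 (self-citations: 0, other citations: 7)
The paper proposes a trustworthy adaptive service composition mechanism. First, it recasts the problem of guaranteeing the trustworthiness of a composite service as an adaptive control problem, with the trustworthiness guarantee policy as an adjustable controller and the composite service as the controlled plant, and designs the corresponding system structure. Second, it models and optimizes the trustworthiness maintenance process and policy of the composite service within the Markov decision process framework and designs the corresponding algorithm, yielding a direct adaptive control mechanism based on reinforcement learning. Finally, simulation experiments comparing adaptive maintenance of the composite service with a random maintenance policy show that adaptive maintenance is clearly superior.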
