1.

夏新海《计算机工程与应用》2020,56(23):245-252

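Traditional distributed adaptive traffic signal control suffers from limited coordination efficiency and the curse of dimensionality. To address this, a model of an urban area traffic signal control system is established, its optimization problem is formulated as game-based coordinated control of traffic signals at local intersections, and a learning algorithm based on local-information game interaction among intersection traffic signal control agents is proposed. During learning, the agents interact through local-information games and autonomously adjust their signal control strategies, gradually learning the optimal strategy. Different traffic demand scenarios are designed, and a performance index is constructed as a weighted combination of the network's average delay and average number of stops. Compared with a genetic algorithm and actuated control, game learning achieves better traffic signal control, converges to the optimal performance index, and manages traffic demand more effectively.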

2.

Deep Reinforcement Learning Traffic Signal Control Based on an Attention Mechanism

任安妮周大可冯锦浩唐慕尧李涛《计算机应用研究》2023,40(2)

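Deep reinforcement learning (DRL) is widely applied to urban traffic signal control, a problem with high uncertainty. However, existing DRL traffic signal control methods use only conventional deep neural networks, whose perception ability is limited in complex traffic scenarios. Moreover, although state is one of the three elements of reinforcement learning, the traffic state in existing methods still requires careful manual design. This paper therefore proposes a DRL traffic signal control algorithm based on an attention mechanism. The attention mechanism lets the neural network automatically focus on the important state components, which strengthens its perception ability, improves signal control performance, and makes the state vector easier to design. Experiments on the SUMO (Simulation of Urban Mobility) platform show that, at single and multiple intersections and under low and high traffic flow, the proposed algorithm uses only a simple traffic state yet outperforms three baseline signal control algorithms on metrics such as average waiting time and travel time.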
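
As a rough illustration of how an attention mechanism can weight state components, the sketch below applies softmax attention over per-lane features. The feature layout, the scoring vector, and the function names are assumptions made for illustration; they are not taken from the paper's network.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(lane_features, score_weights):
    """Weight each lane's feature vector by a softmax attention score.

    lane_features: one feature vector per lane (hypothetical layout:
                   [queue length, mean waiting time]).
    score_weights: hypothetical learned vector used to score each lane.
    Returns (attention weights, attention-weighted sum of the features).
    """
    scores = [sum(w * x for w, x in zip(score_weights, f)) for f in lane_features]
    alphas = softmax(scores)
    dim = len(lane_features[0])
    context = [sum(a * f[i] for a, f in zip(alphas, lane_features)) for i in range(dim)]
    return alphas, context

# Four lanes; the second one is the most congested and receives the largest weight.
lanes = [[2.0, 5.0], [10.0, 40.0], [0.0, 0.0], [4.0, 12.0]]
alphas, context = attend(lanes, score_weights=[0.1, 0.05])
```

In a full DRL controller the scoring vector would be trained jointly with the Q-network; here it is fixed so the weighting step can be read in isolation.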

3.

An Improved Regional Traffic Signal Control Method Based on Q-Learning and Dynamic Weights

张辰喻剑何良华《计算机科学》2016,43(8):171-176

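Q-learning is widely used in traffic signal control. In regional traffic, conventional Q-learning-based regional traffic signal control methods obtain information about surrounding intersections through communication between agents and then make the most favorable decision. These conventional methods perform well in most cases. However, because they compute the reward from the congestion of surrounding intersections inaccurately, they make wrong decisions when congestion levels at the surrounding intersections differ greatly, which leads to local hotspot congestion. This paper analyzes the problem and, building on the conventional regional method, proposes an improved regional traffic signal control method based on Q-learning and dynamic weights. It introduces the concept of an "intersection weight" and applies it to the reward calculation through a multi-objective combination method; the weights change dynamically with the actual traffic at each intersection, which resolves the tendency to fall into local hotspot congestion. Simulations under three different traffic conditions show that the proposed algorithm performs notably better than the conventional control method under congested conditions.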
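
The idea of feeding dynamically weighted neighbor congestion into a Q-learning reward can be sketched as follows. The weighting rule (weights proportional to queue length), the state and action names, and the parameter values are all assumptions for illustration; the abstract does not specify the paper's multi-objective combination method.

```python
from collections import defaultdict

def intersection_weights(queues):
    """Normalise queue lengths into weights that sum to 1 (uniform if all queues are empty)."""
    total = sum(queues.values())
    if total == 0:
        return {k: 1.0 / len(queues) for k in queues}
    return {k: q / total for k, q in queues.items()}

def weighted_reward(queues):
    """Negative weighted queue length: heavily congested intersections dominate the penalty."""
    w = intersection_weights(queues)
    return -sum(w[k] * queues[k] for k in queues)

def q_update(Q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update; returns the new Q(state, action)."""
    best_next = max(Q[(next_state, a)] for a in actions)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
    return Q[(state, action)]

Q = defaultdict(float)                      # all Q-values start at 0
actions = ["keep_phase", "switch_phase"]    # hypothetical action set
queues = {"self": 4, "north_neighbour": 12, "east_neighbour": 0}
r = weighted_reward(queues)
new_q = q_update(Q, "s0", "switch_phase", r, "s1", actions)
```

Because the weights are recomputed from the current queues at every step, a neighbor that becomes heavily congested automatically gains influence on the reward.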

4.

Research on Traffic Signal Control Methods for a Single Urban Intersection

罗金玲《电脑编程技巧与维护》2016,(8)

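Conventional traffic signals cannot effectively relieve current urban traffic congestion, so Q-learning is combined with traffic signal control to address the problem. Traffic control theory is analyzed, reinforcement learning theory and the steps of the Q-learning algorithm are studied, and the Q-learning algorithm is combined with traffic signal control. Simulation results show that the combination outperforms the fixed-cycle signal control currently in use.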

5.

A Traffic Signal Control Method Based on Dueling Double DQN

叶宝林陈栋刘春元陈滨吴维敏《计算机测量与控制》2024,32(7):154-161

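To improve intersection throughput and relieve congestion by mining the deep, implicit features contained in traffic state information, a single-intersection traffic signal control method based on Dueling Double DQN (D3QN) is proposed. A traffic signal control model based on the deep reinforcement learning algorithm Double DQN (DDQN) is built, and the iterative computation of the estimated and target values of the action-value function is optimized, overcoming the slow convergence of DQN-based traffic signal control models. A new dueling network is designed to decouple the value of the traffic state from the value of the phase action, strengthening the ability of Double DQN (DDQN) to extract deep features. A single-intersection simulation framework and environment are built on the microscopic simulation platform SUMO and used for testing. The results show that, compared with conventional traffic signal control and DQN-based traffic signal control, the proposed method effectively reduces average vehicle waiting time, average queue length, and average number of stops, and clearly improves intersection throughput.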
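
The two ingredients named in the abstract can be sketched without a neural network: the dueling aggregation Q(s, a) = V(s) + A(s, a) - mean(A), and the Double DQN target in which the online network selects the next action and the target network evaluates it. The numbers and function names below are illustrative assumptions, not values or code from the paper.

```python
def dueling_q(value, advantages):
    """Combine state value V(s) and advantages A(s, a) into Q(s, a) = V + A - mean(A)."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]

def double_dqn_target(reward, gamma, q_online_next, q_target_next, done=False):
    """Double DQN target: the online network picks the action, the target network scores it."""
    if done:
        return reward
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best_action]

# Hypothetical network outputs for the next state, with three candidate signal phases.
q_online_next = dueling_q(value=2.0, advantages=[1.0, -1.0, 0.0])   # [3.0, 1.0, 2.0]
q_target_next = dueling_q(value=1.5, advantages=[0.5, 1.5, -2.0])   # [2.0, 3.0, -0.5]
y = double_dqn_target(reward=-1.0, gamma=0.9,
                      q_online_next=q_online_next, q_target_next=q_target_next)
```

Note that the target uses the target network's value for phase 0 (the online network's choice) rather than the target network's own maximum, which is what limits the overestimation that plain DQN is prone to.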

6.

A Fuzzy-Rule-Based Method for Designing the Reinforcement Function in Q-Learning

赵晓华李振龙陈阳舟荣建《模式识别与人工智能》2008,21(2)

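Q-learning is a reinforcement learning method for solving Markov decision problems with incomplete information, and the design of its reinforcement signal is an important factor in learning performance. This paper proposes a fuzzy-rule-based method for designing the Q-learning reinforcement signal to improve reinforcement learning performance. The method is applied to optimal signal control at a single intersection, adaptively adjusting the phase switching time and phase sequence as traffic flow changes. Verification with the Paramics microscopic traffic simulation software shows that, for traffic control problems, Q-learning with fuzzy rules learns better than conventional Q-learning.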

7.

A Survey of Reinforcement Learning and Deep Reinforcement Learning Methods for Large-Scale Intelligent Traffic Signal Control

翟子洋郝茹茹董世浩《计算机应用研究》2024,41(6)

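Introducing intelligent detection and control into traffic signal control systems is now the prevailing trend. Reinforcement learning and deep reinforcement learning in particular show great technical advantages in scalability, stability, and generalizability, and have become a research focus in the field. This paper studies reinforcement-learning-based traffic signal control. Drawing on a broad survey of research on traffic signal control methods, it systematically reviews the categories and applications of reinforcement learning and deep reinforcement learning in intelligent traffic signal control; summarizes feasible multi-agent cooperative approaches to large-scale traffic signal control and classifies the traffic-scenario factors that affect it; and, from the perspective of improving traffic signal controller performance, sets out the field's current challenges and promising future research directions.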

8.

Learning-Based Coordinated Control of Urban Traffic Signals under Local Game Interaction

高涵罗娟蔡乾娅郑燕柳《计算机研究与发展》2023,18(12):2797-2805

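The intelligent traffic signal control system is an important component of the intelligent traffic system (ITS) and provides real-time services for a safe and efficient traffic environment. However, existing adaptive traffic signal control methods are constrained by communication and struggle to meet complex, changing traffic demand. To address long communication delay and low effective utilization of signal lights, a multi-agent adaptive coordination method (ADM) for traffic signals with asynchronous decisions based on edge computing is proposed. The method collects environmental information in real time through the proposed end-edge-cloud architecture, introduces asynchronous communication into the multi-agent coordination process of reinforcement learning, and designs an asynchronous decision mechanism in which agents use different decision periods. Experimental results show that edge computing offers a good approach for traffic signal control scenarios with strict real-time requirements. In addition, compared with fixed timing and with the independent Q-learning decision algorithm (IQA), ADM achieves cooperation among agents through the asynchronous decision mechanism and a neighbor information base, reducing the average vehicle waiting length and improving the time utilization of intersections.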

9.

A Two-Level Hierarchical Control Solution for Traffic Signals

戈军周莲英《计算机工程与应用》2015,51(20):246-252

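To address the many shortcomings of existing traffic signal control systems, a two-level hierarchical multi-agent solution for traffic signal control is proposed. The traffic network is partitioned into regions; bottom-level agents control individual intersections and top-level agents control regions, giving two-level hierarchical control. The bottom-level agents use classical Q-learning to learn the optimal policy synchronously, while the top-level agents exploit tile coding's strength in handling continuous spaces to approximate the Q-learning action-value function. Simulation results show that the hierarchical control not only improves the efficiency of the traffic signal control system but also offers a scalable solution for large-scale deployment.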

10.

Tram Signal Priority Control Based on Deep Reinforcement Learning

王云鹏郭戈《自动化学报》2019,45(12):2366-2377

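Existing tram signal priority control systems have many problems, such as an inability to adapt to real-time traffic changes and complex optimization. This paper proposes a tram signal priority control strategy based on deep reinforcement learning. It does not rely on complex traffic modeling of the intersection; it takes real-time traffic information as input and adjusts the traffic signal continuously and dynamically throughout the tram's passage. The passage demands of trams and general vehicles are considered jointly: the strategy tries to ensure that trams need not stop while reducing the delay of general vehicles. The problem is solved with the deep Q-network algorithm, and learning performance is improved with a dueling architecture, double Q-networks, and a weighted sample pool. Experiments based on SUMO show that the model effectively improves the traffic efficiency of both trams and general vehicles.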
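
One plausible reading of the "weighted sample pool" is a replay buffer sampled in proportion to a per-transition priority, as in prioritized experience replay; that reading is an assumption, and the abstract does not say how the weights are defined. A minimal sketch of such sampling, with hypothetical priorities and placeholder transitions:

```python
import random

def sampling_probabilities(priorities, alpha=0.6):
    """Turn per-transition priorities into probabilities p_i^alpha / sum_j p_j^alpha."""
    scaled = [p ** alpha for p in priorities]
    total = sum(scaled)
    return [s / total for s in scaled]

def sample_batch(buffer, priorities, batch_size, rng=random):
    """Draw transitions with replacement, favouring those with higher priority."""
    probs = sampling_probabilities(priorities)
    return rng.choices(buffer, weights=probs, k=batch_size)

buffer = ["t0", "t1", "t2", "t3"]      # placeholder transitions
priorities = [0.1, 2.0, 0.5, 1.0]      # e.g. absolute TD errors (hypothetical values)
probs = sampling_probabilities(priorities)
batch = sample_batch(buffer, priorities, batch_size=3, rng=random.Random(0))
```

The exponent `alpha` controls how strongly sampling favors high-priority transitions; `alpha=0` would recover uniform sampling.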

11.

Coordinated Traffic Signal Control at a Single Intersection Based on Q-Learning

胡宇刘美玲周子昂张敏《计算机与现代化》2020,(5):96-100,105

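Q-learning performs adaptive traffic signal control at a single intersection by interacting with the external environment. With urban traffic growing ever more congested, a Q-learning algorithm that incorporates the SCOOT system's green split optimization method is proposed to relieve congestion. The paper combines SCOOT's green split optimization with Q-learning: it builds a new mathematical model that combines time and economic factors, such as the average vehicle delay rate and the number of stops, as the algorithm's cost function, and defines a continuous reward and penalty function. On this basis it describes in detail how the Q-learning algorithm runs at a single intersection. A side-by-side comparison with the Webster delay rate and with Q-learning based on the minimum average vehicle delay rate verifies that the algorithm outperforms fixed-time control and Q-learning based on average vehicle delay. Compared with these two algorithms, the proposed algorithm is better suited to green split optimization at a single intersection.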

12.

Decision Optimization for Traffic Scenario Problems Based on Reinforcement Learning

罗飞白梦伟《计算机应用》2022,42(8):2361-2368

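When solving the taxi route planning decision problem and the traffic signal control problem in complex traffic scenarios, conventional reinforcement learning algorithms are limited in convergence speed and solution accuracy; an improved reinforcement learning algorithm is therefore proposed for this class of problems. First, an improved reinforcement learning algorithm, GSQL-DSEP, is proposed using an optimized Bellman equation and the speedy Q-learning (SQL) mechanism, together with an experience pool and a direct strategy. Then GSQL-DSEP is used to optimize the path length in the taxi route planning decision problem and the total vehicle waiting time in the traffic signal control problem. Compared with Q-learning, speedy Q-learning (SQL), generalized speedy Q-learning (GSQL), and Dyna-Q, GSQL-DSEP reduces the error by at least 18.7% in performance tests, shortens the decision path length by at least 17.4% in the taxi route planning problem, and reduces the total vehicle waiting time by up to 51.5% in the traffic signal control problem. The experimental results show that GSQL-DSEP has an advantage over the compared algorithms in solving traffic scenario problems.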

13.

Short-term traffic flow forecasting: parametric and nonparametric approaches via emotional temporal difference learning

Javad Abdi Behzad Moshiri Baher Abdulhai Ali Khaki Sedigh 《Neural computing & applications》2013,23(1):141-159

Signals from real-world and natural complex dynamical systems such as traffic flow are usually characterized by irregular motion. The chaotic nonlinear dynamics approach is now the most powerful tool scientists have for dealing with complexity in real cases, and neural networks and neuro-fuzzy models are widely used, more than traditional methods, for their capabilities in nonlinear modeling of chaotic systems. As mentioned, traffic flow conditions cause forecasts of traffic flow to lack robustness and accuracy. In this paper, traffic flow forecasting is analyzed from the viewpoints of emotional concepts and multi-agent systems (MASs) as a new method in this field. The findings enabled the researchers to develop a new object-oriented method of forecasting traffic flow. Its architecture is based on temporal difference (TD) Q-learning with a neuro-fuzzy structure, which is the nonparametric approach. The performance of TD Q-learning is improved by emotional learning. Based on the present conditions and the action of the system according to the criteria, the proposed method could forecast traffic signals so that the objectives are reached in minimum time. The ability of the presented learning algorithm to anticipate gains from future actions and obtain rewards from its past experiences allows the emotional TD Q-learning algorithm to improve its decisions toward the best possible actions. In addition, to study a more practical situation, the neuro-fuzzy behaviors could be modeled by a MAS. The proposed method (intelligent/nonparametric approach) is compared with a parametric approach, the autoregressive integrated moving average (ARIMA) method, which is implemented by multi-layer perceptron neural networks and called ARIMANN. Here, ARIMANN is updated by backpropagation and temporal difference backpropagation for the first time. The simulation results revealed that the studied forecaster could discover the optimal forecast by means of the Q-learning algorithm. The real traffic flow signals used for fitting the algorithms, which are difficult to handle through parametric and classic methods, are obtained from a two-lane street I-494 in Minnesota City.

14.

Recursive least-squares temporal difference learning for adaptive traffic signal control at intersection

Yin Biao Dridi Mahjoub Moudni Abdellah El 《Neural computing & applications》2017,31(2):1013-1028

This paper presents a new method to solve the scheduling problem of adaptive traffic signal control at an intersection. The method involves recursive least-squares temporal difference (RLS-TD(λ)) learning that is integrated into approximate dynamic programming. The learning mechanism of RLS-TD(λ) adapts a linear function approximation by updating its parameters based on environmental feedback. This study investigates the implementation of the method after modeling the traffic dynamics at an intersection in discrete time. In the model, different traffic control schemes regarding signal phase sequence are considered, especially the defined adaptive phase sequence (APS). In simulated traffic scenarios, RLS-TD(λ) is superior to TD(λ) for updating the function parameters in the approximation, and APS outperforms other conventional control schemes in reducing traffic delay. Compared with other traffic signal control algorithms, the proposed algorithm yields satisfying results in terms of traffic delay and computation time.
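
For orientation, one common form of the RLS-TD(λ) recursion for a linear value approximation $V(s) \approx \phi(s)^{\top}\theta$ is sketched below. The notation ($z$ for the eligibility trace, $P$ for the running inverse-matrix estimate, $K$ for the gain) and the omission of a forgetting factor are assumptions; the paper's exact formulation may differ.

```latex
\begin{aligned}
z_t &= \gamma\lambda\, z_{t-1} + \phi(s_t), \\
d_t &= \phi(s_t) - \gamma\,\phi(s_{t+1}), \\
K_t &= \frac{P_{t-1} z_t}{1 + d_t^{\top} P_{t-1} z_t}, \\
\theta_t &= \theta_{t-1} + K_t \left( r_t - d_t^{\top} \theta_{t-1} \right), \\
P_t &= P_{t-1} - K_t\, d_t^{\top} P_{t-1}.
\end{aligned}
```

The last three lines are a rank-one (Sherman-Morrison) update of the least-squares solution, which is why no step size needs tuning, in contrast to plain TD(λ).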

15.

Hierarchical control of traffic signals using Q-learning with tile coding (total citations: 1; self-citations: 1; citations by others: 0)

Monireh Abdoos Nasser Mozayani Ana L. C. Bazzan 《Applied Intelligence》2014,40(2):201-213

Multi-agent systems are rapidly growing as powerful tools for Intelligent Transportation Systems (ITS). It is desirable that traffic signal control, as a part of ITS, is performed in a distributed manner. Therefore agent-based technologies can be efficiently used for traffic signal control. For traffic networks composed of multiple intersections, distributed control achieves better results than centralized methods. Hierarchical structures are useful to decompose the network into multiple sub-networks and provide a mechanism for distributed control of the traffic signals. In this paper, a two-level hierarchical control of traffic signals based on Q-learning is presented. Traffic signal controllers, located at intersections, can be seen as autonomous agents in the first level (at the bottom of the hierarchy) which use Q-learning to learn a control policy. The network is divided into regions, and an agent is assigned to control each region at the second level (top of the hierarchy). Due to the combinatorial explosion in the number of states and actions, i.e. features, the use of Q-learning is impractical. Therefore, in the top level, tile coding is used as a linear function approximation method. A network composed of 9 intersections arranged in a 3×3 grid is used for the simulation. Experimental results show that the proposed hierarchical control improves the Q-learning efficiency of the bottom level agents. The impact of the parameters used in tile coding is also analyzed.
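
Tile coding as a linear function approximator can be sketched for a single scalar input as follows. The one-dimensional input, the number of tilings, and the step-size convention are simplifying assumptions for illustration; the paper's top-level agents work over region-level state-action features that the abstract does not detail.

```python
import math

def active_tiles(x, low, high, num_tilings=4, tiles_per_tiling=8):
    """Return one active tile index per tiling for a scalar x in [low, high].

    Each tiling is shifted by a fraction of a tile width, so nearby inputs
    share most of their active tiles.
    """
    tile_width = (high - low) / tiles_per_tiling
    indices = []
    for t in range(num_tilings):
        offset = t * tile_width / num_tilings
        idx = int(math.floor((x - low + offset) / tile_width))
        idx = min(idx, tiles_per_tiling)          # one extra tile covers the shifted edge
        indices.append(t * (tiles_per_tiling + 1) + idx)
    return indices

def q_value(weights, tiles):
    """Linear approximation: the estimate is the sum of the active tiles' weights."""
    return sum(weights[i] for i in tiles)

def update(weights, tiles, target, alpha=0.1):
    """Move the estimate toward the target, sharing the step across tilings."""
    error = target - q_value(weights, tiles)
    for i in tiles:
        weights[i] += alpha / len(tiles) * error

weights = [0.0] * (4 * 9)                         # 4 tilings x (8 + 1) tiles each
tiles = active_tiles(0.37, low=0.0, high=1.0)
update(weights, tiles, target=1.0)
```

Because inputs such as 0.37 and 0.38 activate mostly the same tiles, an update at one input generalizes to its neighbors, which is what makes the approach workable when a lookup table over all states and actions is too large.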

16.

A Hybrid Traffic Signal Control Method Based on Iterative Learning and Model Predictive Control

闫飞李浦续欣莹《控制理论与应用》2021,38(3):339-348

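Traffic signal control methods based on iterative learning control cannot effectively handle non-repetitive real-time disturbances in the road network. Building on such a method and drawing on the rolling optimization and real-time correction of model predictive control, this paper proposes a hybrid traffic signal control method based on iterative learning and model predictive control. While effectively using the periodic characteristics of traffic flow to improve traffic conditions in the road network, the method can draw on model predictive control's...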

17.

Real-time traffic signal learning control using BPNN based on predictions of the probabilistic distribution of standing vehicles

Chengyou Cui Jisun Shin Heehyol Lee 《Artificial Life and Robotics》2010,15(1):58-61

In this article, a new method to predict the probabilistic distribution of a traffic jam at crossroads and a traffic signal learning control system are proposed. First, a dynamic Bayesian network is used to build a forecasting model that predicts the probabilistic distribution of vehicles in a traffic jam during each period of the traffic signals. An adjusting algorithm for traffic signal control is applied to maintain the probability of a lower limit and a ceiling of standing vehicles to get the desired probabilistic distribution of standing vehicles. In order to achieve real-time control, a learning control system based on a back-propagation neural network is used. Finally, the effectiveness of the new traffic signal control system is shown using actual traffic data.

18.

Type-2 fuzzy logic based urban traffic management

P.G. Balaji D. Srinivasan 《Engineering Applications of Artificial Intelligence》2011,24(1):12-22

This paper presents a multi-agent system based on a type-2 fuzzy decision module for traffic signal control in a complex urban road network. The distributed agent architecture, using a type-2 fuzzy set based controller, was designed to optimize green time in a traffic signal to reduce the total delay experienced by vehicles. A section of the Central Business District of Singapore simulated using PARAMICS software was used as a test bed for validating the proposed agent architecture for signal control. The performance of the proposed multi-agent controller was compared with a hybrid neural network based hierarchical multi-agent system (HMS) controller and the real-time adaptive traffic controller (GLIDE) currently used in Singapore. The performance metrics used for evaluation were the total mean delay experienced by vehicles traveling from source to destination and the current mean speed of vehicles inside the road network. The proposed multi-agent signal control was found to produce a significant improvement in the traffic conditions of the road network, reducing the total travel time experienced by vehicles simulated under dual and multiple peak traffic scenarios.