
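An Active Queue Management Algorithm Based on Reinforcement Learning
Cite this article: ZHANG Yan-Bing, HANG Da-Ming, MA Zheng-Xin, CAO Zhi-Gang. An Active Queue Management Algorithm Based on Reinforcement Learning[J]. Journal of Software, 2004, 15(7): 1090-1098.
Authors: ZHANG Yan-Bing  HANG Da-Ming  MA Zheng-Xin  CAO Zhi-Gang
Affiliation: State Key Laboratory on Microwave and Digital Communications, Department of Electronic Engineering, Tsinghua University, Beijing 100084, China
Funding: Supported by the National High-Tech Research and Development Plan of China (863 Program) under Grant No. 2001AA121062
Abstract: Starting from the viewpoint of optimal decision-making, this paper introduces reinforcement learning, a method from artificial intelligence, into the study of active queue management (AQM) and proposes a reinforcement-learning-based AQM algorithm, RLGD (reinforcement learning gradient-descent). RLGD takes rate matching and queue stability as its optimization objectives and adaptively adjusts the update step size according to the network state, so that the queue length converges quickly to the target value with very little jitter. In addition, RLGD does not need to know the rate adjustment algorithm used at the source, and therefore scales well. Simulations under different network environments show that RLGD achieves better performance and robustness than AQM algorithms such as REM and PI.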

Keywords: congestion control  active queue management  reinforcement learning
Article ID: 1000-9825/2004/15(07)1090
Received: 2003-07-23
Revised: 2003-07-23

A Robust Active Queue Management Algorithm Based on Reinforcement Learning
ZHANG Yan-Bing, HANG Da-Ming, MA Zheng-Xin and CAO Zhi-Gang. A Robust Active Queue Management Algorithm Based on Reinforcement Learning[J]. Journal of Software, 2004, 15(7): 1090-1098.
Authors: ZHANG Yan-Bing  HANG Da-Ming  MA Zheng-Xin and CAO Zhi-Gang
Abstract: From the viewpoint of decision theory, AQM (active queue management) can be considered an optimal decision problem. In this paper, a new AQM scheme, Reinforcement Learning Gradient-Descent (RLGD), is described based on the optimal decision theory of reinforcement learning. Aiming to maximize throughput and stabilize the queue length, RLGD adjusts its update step adaptively and does not need to know the rate adjustment scheme used by the source. Simulations demonstrate that with RLGD the queue length converges quickly to the desired value and oscillates only slightly. The results also show that RLGD is very robust to disturbances under various network conditions and significantly outperforms the traditional REM and PI controllers.
Keywords: congestion control  active queue management  reinforcement learning
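
The abstract names rate matching and queue stability as the optimization objectives but does not give the RLGD update rule. Purely as an illustration of how those two terms can drive an AQM controller, the following is a minimal Python sketch of a REM-style price update with a fixed step, REM being one of the baselines the abstract compares against. It is not the authors' RLGD algorithm; the class name, field names, and parameter values are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class RemQueue:
    """REM-style price state for one link (illustrative baseline, NOT RLGD)."""
    capacity: float        # link capacity, packets per sampling interval (assumed unit)
    target_queue: float    # desired queue length, packets
    gamma: float = 0.001   # fixed update step; per the abstract, RLGD adapts its step instead
    alpha: float = 0.1     # weight of the queue mismatch relative to the rate mismatch
    phi: float = 1.001     # base of the exponential marking function, must be > 1
    price: float = 0.0     # congestion measure, kept non-negative

    def update(self, queue_len: float, arrival_rate: float) -> float:
        """Advance the price by one sampling interval and return the marking probability."""
        # Queue mismatch (stability term) plus rate mismatch (rate-matching term).
        mismatch = (self.alpha * (queue_len - self.target_queue)
                    + (arrival_rate - self.capacity))
        # Step along the mismatch, projected back onto price >= 0.
        self.price = max(0.0, self.price + self.gamma * mismatch)
        return self.mark_probability()

    def mark_probability(self) -> float:
        """Exponential marking: probability 1 - phi^(-price), in [0, 1)."""
        return 1.0 - self.phi ** (-self.price)


if __name__ == "__main__":
    link = RemQueue(capacity=100.0, target_queue=50.0)
    # Overload: queue above target and arrivals above capacity, so the price rises.
    print(link.update(queue_len=80.0, arrival_rate=120.0))
    # Idle link: both mismatches are negative, so the price is clipped back to zero.
    print(link.update(queue_len=0.0, arrival_rate=0.0))
```

With a fixed `gamma`, a small step converges slowly and a large one oscillates; that trade-off is the kind of issue an adaptive update step, as described in the abstract, is meant to address.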
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.