Research on a DDoS Attack-Defense Game Model Based on Q-Learning

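Cite this article:	SHI Yun-fang, WU Dong-ying, LIU Sheng-li, GAO Xiang. Research on a DDoS Attack-Defense Game Model Based on Q-Learning [J]. Computer Science, 2014, 41(11): 203-207, 226.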

Authors:	SHI Yun-fang, WU Dong-ying, LIU Sheng-li, GAO Xiang

Affiliation:	State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou 450002

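Funding:	Supported by the National Natural Science Foundation of China (61309007) and the Zhengzhou Science and Technology Innovation Team Project (10CXTD150)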

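Abstract:	The DDoS attack-defense game under current conditions differs from earlier ones, so existing methods can neither effectively quantify the payoffs of the attacker and the defender nor dynamically adjust game strategies to maximize payoff. To address this problem, a DDoS attack-defense game model based on Q-learning is designed, and a model algorithm is proposed on top of it. First, the payoffs of both sides are computed with a network-entropy quantitative assessment method. Second, a matrix game is used to study the attack-defense game within a single DDoS attack stage. Finally, Q-learning is introduced into the game process to form the model algorithm, which dynamically adjusts attack and defense strategies according to what has been learned in order to maximize payoff. Experimental results show that a defender using the model algorithm obtains a higher payoff, which demonstrates the usability and effectiveness of the algorithm.
Keywords:	DDoS attack-defense; matrix game; Q-learning; network entropy; Nash equilibrium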
Received:	2013/11/22
Revised:	2014/2/24
Research on DDoS Attack-defense Game Model Based on Q-learning

SHI Yun-fang, WU Dong-ying, LIU Sheng-li and GAO Xiang. Research on DDoS Attack-defense Game Model Based on Q-learning[J]. Computer Science, 2014, 41(11): 203-207, 226.

Authors:	SHI Yun-fang, WU Dong-ying, LIU Sheng-li and GAO Xiang

Affiliation:	State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou 450002, China

Abstract:	The DDoS attack-defense game in the current situation differs from earlier ones, so existing methods can neither quantify the payoff values effectively nor adjust the game strategy dynamically to maximize the payoff. In response to this problem, a DDoS attack-defense game model based on Q-learning was designed, and an algorithm was proposed on the basis of the model. First, the payoffs of the attacker and the defender were calculated with the network entropy quantitative assessment method. Second, the attack-defense game within a single DDoS attack stage was studied using the matrix game method. Finally, the model algorithm was obtained by introducing Q-learning into the game process, so that strategies are adjusted dynamically according to the learning outcomes to maximize the payoff. The experimental results show that the defender achieves a higher payoff when adopting the model algorithm, which indicates that the algorithm is practicable and effective.

Keywords:	DDoS attack-defense; Matrix game; Q-learning; Network entropy; Nash equilibrium
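
The abstract describes a per-stage matrix game whose strategies are adjusted by Q-learning. The following is a minimal Python sketch of that general idea, not the authors' algorithm, whose details the abstract does not give. Everything concrete here is an assumption made for illustration: the two attack stages, the payoff values in `STAGE_PAYOFFS` (the paper derives payoffs from network entropy; these numbers are invented), the fixed attacker mixed strategy `ATTACKER_MIX`, and the learning parameters.

```python
import random

# Hypothetical defender payoffs for two attack stages (illustrative values only).
# STAGE_PAYOFFS[s][d][a] is the defender's payoff in stage s when the
# defender plays action d and the attacker plays action a.
STAGE_PAYOFFS = [
    [[2.0, 1.0], [0.0, 3.0]],
    [[3.0, 0.0], [1.0, 2.0]],
]

# Assumed fixed attacker mixed strategy per stage (probability of each attacker action).
ATTACKER_MIX = [[0.9, 0.1], [0.1, 0.9]]


def train_defender(episodes=20000, alpha=0.05, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning for the defender; states are the attack stages."""
    rng = random.Random(seed)
    n_stages = len(STAGE_PAYOFFS)
    q = [[0.0] * len(stage) for stage in STAGE_PAYOFFS]
    for _ in range(episodes):
        # Each episode walks through the stages in order (an assumed transition rule).
        for s in range(n_stages):
            # Epsilon-greedy choice of the defender action.
            if rng.random() < epsilon:
                d = rng.randrange(len(q[s]))
            else:
                d = max(range(len(q[s])), key=q[s].__getitem__)
            # The attacker samples an action from its fixed mixed strategy.
            a = rng.choices(range(len(ATTACKER_MIX[s])), weights=ATTACKER_MIX[s])[0]
            reward = STAGE_PAYOFFS[s][d][a]
            # Standard Q-learning target; the last stage has no successor.
            future = max(q[s + 1]) if s + 1 < n_stages else 0.0
            q[s][d] += alpha * (reward + gamma * future - q[s][d])
    return q


def greedy_policy(q):
    """Return the highest-valued defender action for each stage."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]
```

Calling `greedy_policy(train_defender())` yields the defender action the learned table prefers in each stage. Against a fixed attacker this reduces to learning a best response; a setting where both sides adapt would need an equilibrium-aware update, which this sketch does not attempt.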
This article is indexed in CNKI, Wanfang Data, and other databases.