Research on Dynamic Spectrum Allocation Based on Reinforcement Learning

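Cite this article:	DU Jiang, LIU Yi. Research on dynamic spectrum allocation based on reinforcement learning[J]. Digital Communication (数字通信), 2012, 39(4): 34-38.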

Authors:	DU Jiang, LIU Yi

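Affiliation:	Research Center of Information Security Technology Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065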

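Abstract:	This paper first introduces the background that gave rise to cognitive radio technology, along with the development of reinforcement learning and its advantages when applied to the cognitive domain. It then introduces the basic principles of reinforcement learning and its two common models, Q-Learning and POMDP, describing in some detail their model definitions, underlying ideas, the problems they are meant to describe, and the scenarios in which they are used. Next, it analyzes the main content of top conference and journal papers in this area from recent years. Drawing on the research status and results reported in those papers, it shows that the main characteristics of reinforcement learning are that it can learn an optimal policy accurately and quickly, can model real environments, is highly adaptive, and improves the efficiency of spectrum sensing and allocation, thereby maximizing system throughput. These advantages demonstrate that reinforcement learning will be a promising technique in the cognitive domain.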
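Keywords:	cognitive radio; dynamic spectrum allocation; reinforcement learning; Q-learning; partially observable Markov decision process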
Dynamic spectrum allocation research based on reinforcement learning

DU Jiang, LIU Yi. Dynamic spectrum allocation research based on reinforcement learning[J]. Digital Communication, 2012, 39(4): 34-38.

Authors:	DU Jiang, LIU Yi

Affiliation:	Research Center of Information Security Technology Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, P.R. China

Abstract:	This paper briefly sketches the background and characteristics of cognitive radio and reinforcement learning. It reviews the main research directions in cognitive radio for dynamic spectrum allocation (DSA) and introduces two common reinforcement learning models: Q-learning and the partially observable Markov decision process (POMDP). It then analyzes recent research and developments in DSA based on these two models. Finally, it draws conclusions and forecasts the development trend of the field.

Keywords:	cognitive radio; dynamic spectrum allocation; reinforcement learning; Q-learning; partially observable Markov decision process (POMDP)
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.