Research on resource allocation algorithm of centralized and distributed Q-learning in machine communication

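Cite this article:	Yunhe YU, Jun SUN. Research on resource allocation algorithm of centralized and distributed Q-learning in machine communication[J]. Telecommunications Science, 2021, 37(11): 41-50. DOI: 10.11959/j.issn.1000-0801.2021244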

Authors:	Yunhe YU, Jun SUN

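Affiliation:	College of Telecommunications and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu 210023, China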

Funding:	National Natural Science Foundation of China (61771255); Open Project of the Key Laboratory of the Chinese Academy of Sciences (20190904)

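Abstract:	For the massive machine type communication (mMTC) scenario, two Q-learning-based resource allocation algorithms are proposed, a centralized Q-learning algorithm (team-Q) and a distributed Q-learning algorithm (dis-Q), with the goal of maximizing system throughput while guaranteeing the quality of service (QoS) requirements of some machine type communication devices (MTCDs). First, a clustering algorithm based on cosine similarity (CS) takes the geographic locations and multi-level QoS requirements of the MTCDs into account: multi-dimensional vectors representing the MTCDs and the data aggregators (DAs) are constructed, and grouping is completed according to the CS values between the vectors. Then the team-Q and dis-Q learning algorithms are used to allocate resource blocks (RBs) and power to the MTCDs. In terms of throughput, compared with the dynamic resource allocation algorithm and the greedy algorithm, the team-Q and dis-Q algorithms achieve average improvements of 16% and 23%, respectively. In terms of complexity, the complexity of dis-Q is 25% or less of that of team-Q, and its convergence speed is improved by nearly 40%.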
Keywords:	resource allocation; centralized Q-learning; distributed Q-learning; cosine similarity; multi-dimensional vector
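
The grouping step described in the abstract (vectors for MTCDs and DAs, grouped by their CS values) can be illustrated with a minimal sketch. The abstract does not give the vector layout or the assignment rule, so both are assumptions here: each vector is a hypothetical `(x, y, qos_level)` tuple, and each MTCD is assigned to the DA with the highest CS value. The names `cosine_similarity` and `group_by_cosine_similarity` are illustrative and do not come from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors (assumption)
    return dot / (norm_u * norm_v)


def group_by_cosine_similarity(mtcd_vectors, da_vectors):
    """Assign each MTCD to the DA whose vector has the highest CS value.

    The "highest CS wins" rule is an assumption; the abstract only says
    that grouping is based on the CS values between the vectors.
    """
    groups = {da: [] for da in da_vectors}
    for mtcd, vec in mtcd_vectors.items():
        best_da = max(
            da_vectors,
            key=lambda da: cosine_similarity(vec, da_vectors[da]),
        )
        groups[best_da].append(mtcd)
    return groups


# Hypothetical feature layout: (x, y, qos_level)
da_vectors = {"DA1": (1.0, 0.0, 1.0), "DA2": (0.0, 1.0, 2.0)}
mtcd_vectors = {
    "m1": (0.9, 0.1, 1.0),
    "m2": (0.1, 0.8, 2.0),
    "m3": (0.7, 0.2, 1.0),
}
groups = group_by_cosine_similarity(mtcd_vectors, da_vectors)
print(groups)
```

With these toy vectors, `m1` and `m3` point in nearly the same direction as `DA1` and `m2` aligns with `DA2`, so location and QoS level jointly decide the group.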
Research on resource allocation algorithm of centralized and distributed Q-learning in machine communication

Yunhe YU,Jun SUN. Research on resource allocation algorithm of centralized and distributed Q-learning in machine communication[J]. Telecommunications Science, 2021, 37(11): 41-50. DOI: 10.11959/j.issn.1000-0801.2021244

Authors:	Yunhe YU, Jun SUN

Affiliation:	College of Telecommunications and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210023, China

Abstract:	The resource allocation problem in the massive machine type communication (mMTC) scenario was studied with the goal of maximizing system throughput while guaranteeing the quality of service (QoS) requirements of some machine type communication devices (MTCDs). Two resource allocation algorithms based on Q-learning were proposed: a centralized Q-learning algorithm (team-Q) and a distributed Q-learning algorithm (dis-Q). First, a clustering algorithm based on cosine similarity (CS) was designed that takes into account the MTCDs' geographic locations and multi-level QoS requirements. In the clustering algorithm, multi-dimensional vectors representing the MTCDs and the data aggregators (DAs) were constructed, and the MTCDs were grouped according to the CS values between these vectors. Then, in the MTC network, the team-Q and dis-Q learning algorithms were used to allocate resource blocks and power to the MTCDs. In terms of throughput, compared with the dynamic resource allocation algorithm and the greedy algorithm, the team-Q and dis-Q algorithms achieve average improvements of 16% and 23%, respectively. In terms of complexity, the complexity of the dis-Q algorithm is 25% or less of that of the team-Q algorithm, and its convergence speed is nearly 40% higher.

Keywords:	resource allocation; centralized Q-learning; distributed Q-learning; cosine similarity; multi-dimensional vector
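
Both proposed algorithms build on Q-learning, so a minimal sketch of the standard tabular Q-learning update may help readers place them. It is not the paper's team-Q or dis-Q: the abstract does not define the states, actions, or rewards, so everything concrete below is hypothetical, including the action set of `(resource_block, power)` pairs, the state labels, and the helper names `q_update` and `choose_action`.

```python
import random
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.9):
    """Apply one standard Q-learning update and return the new Q-value.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += alpha * (
        reward + gamma * best_next - q[(state, action)]
    )
    return q[(state, action)]


def choose_action(q, state, actions, epsilon, rng):
    """Epsilon-greedy choice: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


# Hypothetical action set: (resource block index, transmit power level)
actions = [(rb, power) for rb in range(2) for power in (0.1, 0.2)]

q = defaultdict(float)   # Q-table, all entries start at 0.0
rng = random.Random(0)   # seeded for repeatability

# One illustrative step with a made-up reward (e.g. achieved throughput)
action = choose_action(q, "s0", actions, epsilon=0.1, rng=rng)
q_update(q, "s0", action, reward=1.0, next_state="s1", actions=actions)
```

In a centralized variant a single table of this kind would typically cover the joint actions of all devices, while in a distributed variant each device would keep its own smaller table. That reading is offered only as one plausible interpretation of the reported complexity gap, since the abstract does not describe either design.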
This article is indexed in Wanfang Data and other databases.