Foundation items: National Major Project (2018zx0301016); National Natural Science Foundation of China (62071077); Chengdu-Chongqing Science and Technology Innovation Project of Chongqing (KJCXZD2020026)

Keywords: D2D communication; Listen Before Talk (LBT); LTE-Unlicensed (LTE-U); resource allocation; multi-agent reinforcement learning
Received: 2022-03-04

A Joint Resource Allocation Method of D2D Communication Resources Based on Multi-agent Deep Reinforcement Learning
DENG Bingguang,XU Chengyi,ZHANG Tai,SUN Yuanxin,ZHANG Lin,PEI Errong.A Joint Resource Allocation Method of D2D Communication Resources Based on Multi-agent Deep Reinforcement Learning[J].Journal of Electronics & Information Technology,2023,45(4):1173-1182.
Authors:DENG Bingguang  XU Chengyi  ZHANG Tai  SUN Yuanxin  ZHANG Lin  PEI Errong
Affiliation: 1. Institute of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China; 2. Electric Power Research Institute of State Grid Sichuan Electric Power Company, Chengdu 610093, China; 3. Chongqing Jinmei Communication Co., Ltd, Chongqing 400035, China; 4. State Key Laboratory of Communication Anti-interference Technology, University of Electronic Science and Technology of China, Chengdu 611731, China
Abstract: As a short-range communication technology, Device-to-Device (D2D) communication can greatly relieve the load on cellular base stations and improve spectrum utilization. However, directly deploying D2D in licensed or unlicensed bands inevitably causes severe interference to existing users. The resource allocation of D2D communication jointly deployed in licensed and unlicensed bands is usually modeled as a combinatorial optimization problem with mixed-integer nonlinear constraints, which is difficult to solve with traditional optimization methods. To address this challenging problem, a joint resource allocation method for D2D communication based on multi-agent deep reinforcement learning is proposed. In this algorithm, each D2D transmitter in the cellular network acts as an agent that, through deep reinforcement learning, intelligently selects either access to the unlicensed channel or the optimal licensed channel, together with its transmit power. Through the feedback of the D2D pairs that contend for the unlicensed channels based on the Listen Before Talk (LBT) mechanism, the cellular base station can obtain WiFi network throughput information in a non-cooperative manner, so that the algorithm can be executed in a heterogeneous environment while the QoS of WiFi users is guaranteed. Compared with Multi-Agent Deep Q-Network (MADQN), Multi-Agent Q-Learning (MAQL), and random baseline algorithms, the proposed algorithm achieves the maximum throughput while guaranteeing the QoS of both WiFi and cellular users.
Keywords: D2D communication; Listen Before Talk (LBT); LTE-Unlicensed (LTE-U); resource allocation; multi-agent reinforcement learning
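To make the abstract's action design concrete, the sketch below shows how each D2D transmitter, acting as an agent, can jointly select a channel (channel 0 standing in for the unlicensed channel, the rest for licensed channels) and a discrete transmit power level. It follows the tabular MAQL baseline mentioned in the abstract rather than the proposed deep method; the channel count, power levels, learning parameters, and single-state Q-table are illustrative assumptions, not the paper's actual settings.

```python
import random

# Hypothetical action space: channel 0 is the unlicensed channel,
# channels 1..3 are licensed; power levels in dBm are illustrative.
CHANNELS = list(range(4))
POWER_LEVELS_DBM = [5, 10, 17, 23]

# Joint action = (channel, power level); each D2D transmitter (agent)
# keeps its own Q-values over these joint actions, as in tabular MAQL.
ACTIONS = [(c, p) for c in CHANNELS for p in POWER_LEVELS_DBM]

class D2DAgent:
    def __init__(self, epsilon=0.1, alpha=0.5, gamma=0.9):
        self.q = {a: 0.0 for a in ACTIONS}  # single-state Q-table for brevity
        self.epsilon, self.alpha, self.gamma = epsilon, alpha, gamma

    def select_action(self):
        # Epsilon-greedy: explore with probability epsilon, else exploit.
        if random.random() < self.epsilon:
            return random.choice(ACTIONS)
        return max(self.q, key=self.q.get)

    def update(self, action, reward):
        # One-step Q-learning update (stateless variant for illustration);
        # the reward would come from the achieved throughput under the
        # QoS constraints of WiFi and cellular users.
        best_next = max(self.q.values())
        self.q[action] += self.alpha * (reward + self.gamma * best_next
                                        - self.q[action])

agent = D2DAgent(epsilon=0.0)       # greedy, for a deterministic demo
agent.update((1, 23), reward=1.0)   # licensed channel 1 at 23 dBm rewarded
print(agent.select_action())        # prints (1, 23): the rewarded action
                                    # now has the highest Q-value
```

The proposed method replaces the per-action Q-table with a deep Q-network over the same kind of joint channel-and-power action space, which is what allows it to scale beyond the small illustrative setting shown here.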
