
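Transmit Power Allocation Method of Frequency Diverse Array-Multi Input and Multi Output Radar Based on Reinforcement Learning
Cite this article:DING Zihang,XIE Junwei,QI Cheng.Transmit Power Allocation Method of Frequency Diverse Array-Multi Input and Multi Output Radar Based on Reinforcement Learning[J].Journal of Electronics & Information Technology,2023,45(2):550-557.
Authors:DING Zihang  XIE Junwei  QI Cheng
Affiliation:Air and Missile Defense College, Air Force Engineering University, Xi’an 710051, China
Abstract:The electromagnetic environment is growing steadily more complex and variable, and new jamming techniques keep emerging, posing serious challenges and threats to radar systems. This paper introduces a spectrum interference model and proposes a Reinforcement Learning (RL)-based transmit power allocation optimization method within a dynamic game framework between a Frequency Diverse Array-Multi Input and Multi Output (FDA-MIMO) radar and a jammer, so that the radar system attains the maximum Signal-to-Interference plus Noise Ratio (SINR). First, a mathematical model of the FDA-MIMO radar is established, and the spectrum interference model is constructed on that basis. Second, the radar and the jammer are treated as players in a Stackelberg game, with the radar as leader and the jammer as follower, and a transmit power allocation optimization model is built under this dynamic game framework. The Deep Deterministic Policy Gradient (DDPG) algorithm is adopted, with a reward function designed around the power constraints, to allocate the radar transmit power in real time and maximize the output SINR. Finally, simulation results show that, within the radar-jammer game framework, the proposed algorithm effectively optimizes the radar transmit power and gives the radar good anti-jamming performance.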

Keywords:Frequency diverse array    Reinforcement learning    Game theory    Power allocation
Received:2021-12-22
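
The leader-follower structure described in the abstract can be illustrated with a deliberately small sketch. Everything here is an assumption made for illustration: the two-channel SINR expression, the gains `g` and `h`, the noise level, and the grid search are hypothetical stand-ins, not the FDA-MIMO signal model or the DDPG solver used in the paper. The sketch only shows the order of play: the radar commits to a power split, the jammer observes it and responds with the split that hurts the radar most, and the radar chooses its split with that response in mind.

```python
def sinr(p, j, g=(1.0, 0.6), h=(0.8, 0.5), noise=0.1):
    """Toy SINR (assumed form): per-channel signal over jamming plus noise, summed."""
    return sum(gm * pm / (hm * jm + noise) for gm, pm, hm, jm in zip(g, p, h, j))


def follower_best_response(p, jam_total, grid):
    """Jammer (follower) sees the radar split p and picks the split minimizing SINR."""
    candidates = [(f * jam_total, (1.0 - f) * jam_total) for f in grid]
    return min(candidates, key=lambda j: sinr(p, j))


def leader_allocation(radar_total, jam_total, grid):
    """Radar (leader) picks the split whose SINR after the jammer's response is largest."""
    best_p, best_j, best_val = None, None, float("-inf")
    for f in grid:
        p = (f * radar_total, (1.0 - f) * radar_total)
        j = follower_best_response(p, jam_total, grid)
        val = sinr(p, j)
        if val > best_val:
            best_p, best_j, best_val = p, j, val
    return best_p, best_j, best_val


# Fractions 0.00, 0.05, ..., 1.00 of each side's total power on channel 1.
grid = [k / 20 for k in range(21)]
p_star, j_star, sinr_star = leader_allocation(10.0, 5.0, grid)
print("radar split:", p_star, "jammer split:", j_star, "SINR:", round(sinr_star, 3))
```

Because the leader optimizes against the follower's best response, the resulting SINR is never worse than what any fixed split on the same grid (for example an even split) achieves against its own worst-case jammer response. A grid search is used only because the toy problem is tiny; it does not scale to the continuous, multi-channel setting the abstract describes.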

Transmit Power Allocation Method of Frequency Diverse Array-Multi Input and Multi Output Radar Based on Reinforcement Learning
DING Zihang,XIE Junwei,QI Cheng.Transmit Power Allocation Method of Frequency Diverse Array-Multi Input and Multi Output Radar Based on Reinforcement Learning[J].Journal of Electronics & Information Technology,2023,45(2):550-557.
Authors:DING Zihang  XIE Junwei  QI Cheng
Affiliation:Air and Missile Defense College, Air Force Engineering University, Xi’an 710051, China
Abstract:In recent years, the electromagnetic environment has become increasingly complex and changeable, and new jamming methods keep emerging, which poses great challenges and threats to radar systems. In this paper, a spectrum interference model is introduced, and a transmit power allocation optimization method based on Reinforcement Learning (RL) is proposed under a dynamic game framework between a Frequency Diverse Array-Multi Input and Multi Output (FDA-MIMO) radar and a jammer, so that the radar system can obtain the maximum output Signal-to-Interference plus Noise Ratio (SINR). Firstly, the mathematical model of the FDA-MIMO radar is established, and on this basis the spectrum interference model is constructed. Secondly, a Stackelberg game relationship is identified between the radar and the jammer: taking the radar as the leader and the jammer as the follower, the transmit power allocation optimization model is established under the dynamic game framework. Using the Deep Deterministic Policy Gradient (DDPG) algorithm, a reward function incorporating the power constraints is designed to allocate the radar transmit power in real time and obtain the maximum output SINR. Finally, simulation results show that, under the framework of the game between radar and jammer, the proposed optimization algorithm effectively optimizes the radar transmit power and gives the radar better anti-jamming performance.
Keywords:Frequency diverse array  Reinforcement learning  Game theory  Power allocation
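
The abstract mentions a reward function designed together with the power constraints but does not give its form. The following is a minimal sketch of one common way such a reward could be shaped for a continuous-action agent like DDPG, assuming a total power budget and an optional per-channel minimum. The function names, the softmax projection, and the penalty weight are all hypothetical choices for illustration, not the authors' design.

```python
import math


def project_to_budget(raw_action, total_power, p_min=0.0):
    """Map unconstrained actor outputs to powers >= p_min that sum to total_power.

    Assumption: a softmax over the raw outputs shares out the power left
    after every channel has received its minimum.
    """
    n = len(raw_action)
    spare = total_power - n * p_min
    if spare < 0:
        raise ValueError("budget too small for the per-channel minimum")
    peak = max(raw_action)  # subtract the max for numerical stability
    weights = [math.exp(a - peak) for a in raw_action]
    norm = sum(weights)
    return [p_min + spare * w / norm for w in weights]


def reward(powers, sinr_linear, total_power, penalty=10.0):
    """SINR in dB minus a penalty proportional to any overshoot of the budget."""
    overshoot = max(0.0, sum(powers) - total_power)
    return 10.0 * math.log10(sinr_linear) - penalty * overshoot


powers = project_to_budget([0.3, -1.2, 2.0], total_power=10.0, p_min=0.5)
print([round(p, 3) for p in powers], round(reward(powers, 100.0, 10.0), 3))
```

The projection keeps every action feasible by construction, so the penalty term only matters if the allocation is produced some other way (for example, clipped actor outputs). Which of the two mechanisms to rely on is a design choice that the abstract leaves open.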