
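Multi-objective optimization of unmanned aerial vehicle assisted internet of things based on deep reinforcement learning
Cite this article: Xu Yulong, Li Jun, Li Zhengquan, Hu Jing, Zhang Sheng, Wang Ziwei. Multi-objective optimization of unmanned aerial vehicle assisted internet of things based on deep reinforcement learning[J]. Foreign Electronic Measurement Technology, 2024, 43(5): 26-35.
Authors: Xu Yulong  Li Jun  Li Zhengquan  Hu Jing  Zhang Sheng  Wang Ziwei
Affiliations: 1. School of Electronics and Information Engineering, Nanjing University of Information Science and Technology; 2. Wuxi University; 3. State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications
Funding: Supported by the Open Project of the State Key Laboratory of Networking and Switching Technology (Beijing University of Posts and Telecommunications) (SKLNST-2023-1-13)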
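Abstract: The unmanned aerial vehicle (UAV)-assisted wireless-powered internet of things (IoT) is a novel network architecture that uses UAVs as energy transfer intermediaries to overcome the power supply limitations of IoT devices. To address the problem of learning multi-objective control policies in UAV-assisted wireless-powered IoT networks, a multi-objective twin delayed deep deterministic policy gradient (MOTD3) algorithm based on deep reinforcement learning is proposed. Subject to constraints on yaw angle, flight speed, and transmit power, the algorithm jointly optimizes multiple objectives: maximizing the total data rate and total harvested energy while minimizing energy consumption and hover time. The UAV also performs online path planning as demand changes dynamically. Simulation results show that, while maintaining good convergence and stability, the algorithm improves on other algorithms by 14.7%, 10.6%, 6.1%, and 10.3% in total data rate, total harvested energy, energy consumption, and hover time, respectively. It also generalizes well and is applicable to different practical communication scenarios.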

Keywords: internet of things  unmanned aerial vehicle  deep reinforcement learning  multi-objective optimization  path planning

Multi-objective optimization of unmanned aerial vehicle assisted internet of things based on deep reinforcement learning
Xu Yulong, Li Jun, Li Zhengquan, Hu Jing, Zhang Sheng, Wang Ziwei. Multi-objective optimization of unmanned aerial vehicle assisted internet of things based on deep reinforcement learning[J]. Foreign Electronic Measurement Technology, 2024, 43(5): 26-35.
Authors:Xu Yulong  Li Jun  Li Zhengquan  Hu Jing  Zhang Sheng  Wang Ziwei
Abstract: The unmanned aerial vehicle (UAV)-assisted wireless power supply for the internet of things (IoT) is an innovative network architecture in which UAVs serve as energy transmission intermediaries, addressing the power supply limitations of IoT devices. To address the challenge of multi-objective control policy learning in UAV-assisted wireless power supply for the IoT, this study proposes a multi-objective twin delayed deep deterministic policy gradient (MOTD3) algorithm based on deep reinforcement learning. The MOTD3 algorithm jointly optimizes multiple objectives, maximizing the total data rate and total harvested energy while minimizing energy consumption and hover time, under constraints on yaw angle, flight speed, and transmission power. Additionally, it adapts the UAV to dynamic changes in demand through online path planning. Simulation results demonstrate that, compared with other algorithms, the proposed algorithm improves the total data rate, total harvested energy, energy consumption, and hover time by 14.7%, 10.6%, 6.1%, and 10.3%, respectively. It also has strong generalization ability and can be applied to different communication scenarios in practice.
Keywords: internet of things (IoT)  unmanned aerial vehicle (UAV)  deep reinforcement learning (DRL)  multi-objective optimization  trajectory optimization
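
The abstract names four objectives and three action constraints but does not give the reward or update equations. The following Python sketch shows one common way such a problem might be set up, and is not the authors' formulation: a weighted-sum scalarization of the four objectives, box clipping of the action, and the clipped double-Q target that standard TD3 uses. The weights, the limits, and all names (`StepOutcome`, `constrain_action`, `scalar_reward`, `td3_target`) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class StepOutcome:
    """Per-step quantities; assumed to be pre-normalized to similar scales."""
    data_rate: float         # total data rate (to maximize)
    harvested_energy: float  # total harvested energy (to maximize)
    energy_used: float       # UAV energy consumption (to minimize)
    hover_time: float        # hover time (to minimize)


def clip(x, lo, hi):
    """Clamp x into the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def constrain_action(yaw, speed, power, yaw_max, speed_max, power_max):
    """Project a raw action onto box constraints (limits are assumptions)."""
    return (
        clip(yaw, -yaw_max, yaw_max),
        clip(speed, 0.0, speed_max),
        clip(power, 0.0, power_max),
    )


def scalar_reward(o, w=(0.25, 0.25, 0.25, 0.25)):
    """Weighted sum: maximized terms are added, minimized terms subtracted.

    The equal weights are an assumption, not values from the paper.
    """
    return (
        w[0] * o.data_rate
        + w[1] * o.harvested_energy
        - w[2] * o.energy_used
        - w[3] * o.hover_time
    )


def td3_target(reward, gamma, q1_next, q2_next, done):
    """Clipped double-Q target of standard TD3: bootstrap from the smaller
    of the two target-critic estimates, and not at all on terminal steps."""
    return reward + gamma * (1.0 - float(done)) * min(q1_next, q2_next)


if __name__ == "__main__":
    action = constrain_action(4.0, -1.0, 0.5, yaw_max=3.0, speed_max=20.0, power_max=1.0)
    r = scalar_reward(StepOutcome(4.0, 2.0, 1.0, 1.0))
    print(action, r, td3_target(r, 0.5, 2.0, 4.0, done=False))
```

A weighted sum is only one scalarization choice; how MOTD3 actually balances the four objectives is described in the full text, not in the abstract.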