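Online Trajectory Optimization for the UAV-Mounted Base Stations
Cite this article: Guangchi ZHANG, Yulin YAN, Miao CUI, Wei CHEN, Jing ZHANG. Online Trajectory Optimization for the UAV-Mounted Base Stations[J]. Journal of Electronics & Information Technology, 2021, 43(12): 3605-3611. doi: 10.11999/JEIT200525
Authors: Guangchi ZHANG  Yulin YAN  Miao CUI  Wei CHEN  Jing ZHANG
Affiliations: 1. School of Information Engineering, Guangdong University of Technology, Guangzhou 510006; 2. Institute of Environmental Geology Exploration of Guangdong Province, Guangzhou 510080; 3. China Academy of Electronics and Information Technology, Beijing 100043
Funding: Science and Technology Plan of Guangdong Province (2017B090909006, 2019B010119001, 2020A050515010, 2021A0505030015), Guangdong Special Support Program (2019TQ05X409)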
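Abstract: Offline flight-route designs for unmanned aerial vehicle (UAV) base stations cannot satisfy random, dynamic communication requests from ground users. To address this difficulty, this paper studies an online flight-route optimization algorithm. A single UAV aerial base station provides wireless communication service to two ground users, and the UAV's flight route is optimized online in real time to minimize the average communication delay with the ground users. First, because the UAV's states and actions in the system are continuous, the problem is transformed into a Markov decision process (MDP); then, the delay of a single communication is introduced into the action-value function; finally, the Monte Carlo and Q-Learning algorithms from reinforcement learning are each used to optimize the UAV's flight route online. Simulation results show that the average delay achieved by the proposed online optimization is better than that of the "fixed position" and "greedy algorithm" schemes.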

Keywords: UAV communication   Online flight-route optimization   Average delay minimization   Reinforcement learning
Received: 2020-06-29
Revised: 2021-06-07

Online Trajectory Optimization for the UAV-Mounted Base Stations
Guangchi ZHANG, Yulin YAN, Miao CUI, Wei CHEN, Jing ZHANG. Online Trajectory Optimization for the UAV-Mounted Base Stations[J]. Journal of Electronics & Information Technology, 2021, 43(12): 3605-3611. doi: 10.11999/JEIT200525
Authors:Guangchi ZHANG  Yulin YAN  Miao CUI  Wei CHEN  Jing ZHANG
Affiliation:1. School of Information Engineering, Guangdong University of Technology, Guangzhou 510006, China; 2. Institute of Environmental Geology Exploration of Guangdong Province, Guangzhou 510080, China; 3. China Academy of Electronics and Information Technology, Beijing 100043, China
Abstract:Offline trajectory design for an Unmanned Aerial Vehicle (UAV)-mounted base station cannot handle random and dynamic communication requests from ground users. To address this problem, an online trajectory optimization algorithm is proposed for the UAV-mounted base station. In the considered system, a single UAV serves as an aerial base station that provides wireless communication service to two ground users, and the UAV’s trajectory is optimized to minimize the average communication delay of the ground users. First, the problem is cast as a Markov Decision Process (MDP), and the delay of a single communication is then introduced into the action-value function. Finally, the Monte Carlo and Q-Learning algorithms from reinforcement learning are each adopted to realize the online trajectory optimization. Simulation results show that the proposed algorithm outperforms the “fixed position” and “greedy algorithm” schemes.
Keywords:Unmanned Aerial Vehicle (UAV) communication  Online trajectory optimization  Average delay minimization  Reinforcement learning
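
The abstract does not specify the state space, the action set, or the delay model, so the following is only a minimal sketch of tabular Q-Learning in which the delay of each single communication enters the action-value update as a negative reward. The one-dimensional grid of UAV positions, the `comm_delay` rate model, the request probability `p_user0`, and all hyperparameters are illustrative assumptions, not the authors' formulation.

```python
import math
import random

# Hypothetical action set: move one cell left, hover, move one cell right.
ACTIONS = (-1, 0, 1)


def comm_delay(uav_x, user_x, height=1.0, bits=1.0, snr0=100.0):
    """Assumed delay model: bits / rate, with rate = log2(1 + snr0 / d^2)."""
    d2 = (uav_x - user_x) ** 2 + height ** 2
    return bits / math.log2(1.0 + snr0 / d2)


def q_learning(n_cells=11, p_user0=0.8, episodes=3000, steps=50,
               alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    """Tabular Q-Learning; the state is the UAV's cell on a line (assumption)."""
    rng = random.Random(seed)
    users = (0, n_cells - 1)  # two ground users at the ends of the line
    q = [[0.0] * len(ACTIONS) for _ in range(n_cells)]
    for _ in range(episodes):
        s = rng.randrange(n_cells)  # random starting cell
        for _ in range(steps):
            # Epsilon-greedy action selection.
            if rng.random() < eps:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[s][i])
            s2 = min(max(s + ACTIONS[a], 0), n_cells - 1)
            # A random user issues a request after the UAV has moved.
            user_x = users[0] if rng.random() < p_user0 else users[1]
            # The single-communication delay is used as a negative reward.
            reward = -comm_delay(s2, user_x)
            q[s][a] += alpha * (reward + gamma * max(q[s2]) - q[s][a])
            s = s2
    return q


def greedy_rollout(q, start, steps=30):
    """Follow the learned greedy policy and return the final cell."""
    s = start
    for _ in range(steps):
        a = max(range(len(ACTIONS)), key=lambda i: q[s][i])
        s = min(max(s + ACTIONS[a], 0), len(q) - 1)
    return s
```

In this toy setting, calling `greedy_rollout(q_learning(p_user0=1.0), start=10)` makes every request come from the user at cell 0, so the learned policy would be expected to fly toward that user. A Monte Carlo variant would replace the bootstrapped target `reward + gamma * max(q[s2])` with the return observed over a complete episode.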
This article is indexed in Wanfang Data and other databases.