
Optimized FANET Routing Algorithm with Reinforcement Learning Based on Function Approximation

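Cite this article:	XIE Yongsheng, YANG Yuwang, QIU Xiulin, WANG Yinyin. Optimized FANET Routing Algorithm with Reinforcement Learning Based on Function Approximation[J]. Computer Engineering, 2021, 47(11): 207-213.

Authors:	XIE Yongsheng, YANG Yuwang, QIU Xiulin, WANG Yinyin

Affiliation:	School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China

Funding:	Key Research and Development Program of Jiangsu Province (BE2018393); Suzhou Key Industrial Technology Innovation Project (SYG201826).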

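Abstract:	Routing protocols in flying ad hoc networks struggle to maintain links when nodes move at high speed. To address this problem, an adaptive link-state routing optimization algorithm based on reinforcement learning, QLA-OLSR, is proposed. Drawing on the Q-learning algorithm from reinforcement learning, it senses changes in the number of neighbor nodes and the level of traffic load in a dynamic environment, and constructs a value function to solve for the optimal HELLO time slot, improving each node's ability to discover and maintain links. The state similarity mechanism of an optimized Kanerva coding algorithm is used to reduce the complexity of QLA-OLSR and improve its stability. Simulation results show that QLA-OLSR effectively increases network throughput and reduces routing maintenance overhead, and that it is self-learning, which makes it suitable for flying ad hoc networks in highly dynamic environments.
Keywords:	flying ad hoc network; function approximation; Q-learning; routing algorithm; adaptive HELLO time slot
Received:	2020-09-27
Revised:	2020-11-16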
Optimized FANET Routing Algorithm with Reinforcement Learning Based on Function Approximation

XIE Yongsheng, YANG Yuwang, QIU Xiulin, WANG Yinyin. Optimized FANET Routing Algorithm with Reinforcement Learning Based on Function Approximation[J]. Computer Engineering, 2021, 47(11): 207-213.

Authors:	XIE Yongsheng, YANG Yuwang, QIU Xiulin, WANG Yinyin

Affiliation:	School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China

Abstract:	The high-speed movement of nodes in a Flying Ad-Hoc Network (FANET) makes it difficult for the routing protocol to maintain links. To address this problem, an algorithm named QLA-OLSR is proposed, which uses Reinforcement Learning (RL) to adaptively optimize link-state routing. By sensing changes in the number of neighbor nodes and in the service load in the dynamic environment, the Q-learning algorithm from RL is used to construct a value function. On this basis, the optimal HELLO time slot is solved for, improving each node's link discovery and maintenance. The State Similarity Mechanism (SSM) of an improved Kanerva coding algorithm is then used to reduce the complexity of the algorithm and increase its stability. Simulation results show that QLA-OLSR significantly improves network throughput, reduces routing maintenance overhead, and is capable of self-learning, which makes it suitable for FANETs in highly dynamic environments.

Keywords:	Flying Ad-Hoc Network (FANET); function approximation; Q-learning; routing algorithm; adaptive HELLO time slot
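
The abstract describes QLA-OLSR only at a high level, so the following is a minimal sketch of the general idea, not the authors' implementation. It shows Q-learning over a small set of candidate HELLO intervals, with the value function approximated by Kanerva-style coding: each state activates the nearby prototypes, and Q(s, a) is the sum of their weights. Everything specific here is an assumption made for illustration: the names `KanervaQ`, `HELLO_INTERVALS` and `toy_reward`, the candidate interval values, the two-dimensional normalized state (neighbor-change rate, traffic load), the radius-based activation (a plain stand-in for the paper's optimized state similarity mechanism), and the reward shape.

```python
import random

# Candidate HELLO intervals in seconds (assumed values, not from the paper).
HELLO_INTERVALS = [0.5, 1.0, 2.0, 4.0]


class KanervaQ:
    """Q-learning where Q(s, a) is a sum of weights over nearby prototypes."""

    def __init__(self, n_prototypes=50, radius=0.2, alpha=0.1, gamma=0.9, seed=0):
        self.rng = random.Random(seed)
        # Prototypes are random points in the normalized state space
        # (neighbor-change rate, traffic load), each component in [0, 1].
        self.prototypes = [(self.rng.random(), self.rng.random())
                           for _ in range(n_prototypes)]
        self.radius = radius
        self.alpha = alpha
        self.gamma = gamma
        # One weight per (prototype, action) pair.
        self.weights = [[0.0] * len(HELLO_INTERVALS) for _ in range(n_prototypes)]

    def active(self, state):
        """Indices of prototypes within `radius` (Chebyshev distance) of `state`.

        Falls back to the single nearest prototype so the set is never empty.
        """
        dist = [max(abs(p[0] - state[0]), abs(p[1] - state[1]))
                for p in self.prototypes]
        idx = [i for i, d in enumerate(dist) if d <= self.radius]
        return idx or [min(range(len(dist)), key=dist.__getitem__)]

    def q(self, state, action):
        return sum(self.weights[i][action] for i in self.active(state))

    def choose(self, state, epsilon=0.1):
        """Epsilon-greedy choice of an index into HELLO_INTERVALS."""
        if self.rng.random() < epsilon:
            return self.rng.randrange(len(HELLO_INTERVALS))
        return max(range(len(HELLO_INTERVALS)), key=lambda a: self.q(state, a))

    def update(self, state, action, reward, next_state):
        """One Q-learning step; the TD error is shared among active prototypes."""
        best_next = max(self.q(next_state, a) for a in range(len(HELLO_INTERVALS)))
        td_error = reward + self.gamma * best_next - self.q(state, action)
        idx = self.active(state)
        step = self.alpha / len(idx)
        for i in idx:
            self.weights[i][action] += step * td_error
        return td_error


def toy_reward(state, action):
    """Hypothetical reward, used only to exercise the learner."""
    churn, load = state
    interval = HELLO_INTERVALS[action]
    stale_link_cost = churn * interval   # slow HELLOs miss topology changes
    overhead_cost = load / interval      # frequent HELLOs compete with traffic
    return -(stale_link_cost + overhead_cost)


if __name__ == "__main__":
    agent = KanervaQ()
    env_rng = random.Random(1)
    state = (env_rng.random(), env_rng.random())
    for _ in range(5000):
        action = agent.choose(state)
        reward = toy_reward(state, action)
        next_state = (env_rng.random(), env_rng.random())  # toy dynamics
        agent.update(state, action, reward, next_state)
        state = next_state
    for probe in [(0.9, 0.1), (0.1, 0.9)]:
        chosen = HELLO_INTERVALS[agent.choose(probe, epsilon=0.0)]
        print(probe, "->", chosen, "s")
```

Storing weights per prototype rather than per state keeps the table size fixed regardless of how finely the neighbor and load measurements are resolved, which is the usual motivation for Kanerva coding. In a real node the state would come from OLSR neighbor tables and queue statistics rather than the random toy dynamics used above.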
This article is indexed in Wanfang Data and other databases.