
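Multiple autonomous underwater vehicle task allocation policy based on a robust Restless Bandits model
Cite this article: LI Xinbin, ZHANG Shoutao, YAN Lei, HAN Song. Multiple autonomous underwater vehicle task allocation policy based on robust Restless Bandits model[J]. Journal of Computer Applications (计算机应用), 2019, 39(10): 2795-2801.
Authors: LI Xinbin  ZHANG Shoutao  YAN Lei  HAN Song
Affiliation (all four authors): Hebei Key Laboratory of Industrial Computer Control Engineering (Yanshan University), Qinhuangdao, Hebei 066004
Funding: National Natural Science Foundation of China (61873224, 61571387).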
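Abstract: This paper studies task allocation for cooperative information collection by multiple autonomous underwater vehicles (AUVs) in an underwater monitoring network. First, to account for the influence of both the states of the target sensor nodes and the states of the acoustic channels on AUV task allocation, a comprehensive model of the underwater acoustic monitoring network is constructed. Second, in view of the many unknown interference factors under water and the model inaccuracy they cause, the multi-AUV task allocation system is modeled, on the basis of reinforcement learning theory, as a robust Restless Bandits Problem (RBP). Finally, a robust Whittle algorithm is proposed to solve the RBP and thereby obtain the multi-AUV task allocation policy. Simulation results show that, in an environment with interference, when the system selects 1, 2, and 3 targets, the robust AUV allocation policy improves the system's cumulative return by 5.5%, 12.3%, and 9.6%, respectively, over an allocation policy that does not consider interference, which verifies the effectiveness of the proposed method.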

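Keywords: underwater acoustic monitoring network  autonomous underwater vehicle task allocation  robust control  uncertain model  restless bandit problem
Received: 2019-03-04
Revised: 2019-05-18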

Multiple autonomous underwater vehicle task allocation policy based on robust Restless Bandit model
LI Xinbin, ZHANG Shoutao, YAN Lei, HAN Song. Multiple autonomous underwater vehicle task allocation policy based on robust Restless Bandit model[J]. Journal of Computer Applications, 2019, 39(10): 2795-2801.
Authors:LI Xinbin  ZHANG Shoutao  YAN Lei  HAN Song
Affiliation: Hebei Key Laboratory of Industrial Computer Control Engineering (Yanshan University), Qinhuangdao, Hebei 066004, China
Abstract: This paper studies collaborative task allocation among multiple Autonomous Underwater Vehicles (AUVs) for information acquisition in an underwater monitoring network. First, a comprehensive model of the underwater acoustic monitoring network is constructed that accounts for both the status of the target sensor nodes and the status of the acoustic communication channels. Second, because multiple unknown interference factors exist under water and make the model inaccurate, the multi-AUV task allocation system is modeled as a robust Restless Bandits Problem (RBP) on the basis of reinforcement learning theory. Finally, a robust Whittle algorithm is proposed to solve the RBP and obtain the multi-AUV task allocation policy. Simulation results show that, in an environment with interference, when the system selects 1, 2, and 3 targets, the cumulative return of the robust allocation policy improves by 5.5%, 12.3%, and 9.6%, respectively, compared with an allocation policy that does not consider interference, which supports the effectiveness of the proposed approach.
Keywords: underwater acoustic monitoring network  Autonomous Underwater Vehicle (AUV) task allocation  robust control  uncertain model  Restless Bandit Problem (RBP)
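
The abstract names a robust Whittle algorithm but gives no details of it, so the following is only a generic illustration of the idea and not the authors' method. It is a minimal Python sketch under several assumptions that do not come from the paper: each target is an arm with a small finite state space, model uncertainty is a finite list of candidate transition matrices per action, the worst case is taken independently for each state and action, rewards are discounted by a hypothetical factor `gamma`, and the arm is indexable so that bisection on the passive subsidy is valid. All function and parameter names are hypothetical.

```python
from typing import List, Sequence, Tuple

Matrix = Sequence[Sequence[float]]  # one transition matrix: P[s][s_next]


def worst_case_next_value(models: Sequence[Matrix], state: int,
                          v: Sequence[float]) -> float:
    """Smallest expected next-step value over all candidate models."""
    return min(sum(p * x for p, x in zip(model[state], v)) for model in models)


def robust_q_values(rewards: Sequence[Sequence[float]],
                    passive_models: Sequence[Matrix],
                    active_models: Sequence[Matrix],
                    subsidy: float, gamma: float = 0.9,
                    tol: float = 1e-9,
                    max_iter: int = 10_000) -> List[Tuple[float, float]]:
    """Robust value iteration for a single arm.

    rewards[s][a] is the reward in state s for a=0 (passive) or a=1 (active).
    The passive action additionally earns `subsidy`. Returns, per state,
    the pair (Q_passive, Q_active) under the worst-case candidate model.
    """
    n = len(rewards)
    v = [0.0] * n

    def q_from(values: Sequence[float]) -> List[Tuple[float, float]]:
        return [
            (rewards[s][0] + subsidy
             + gamma * worst_case_next_value(passive_models, s, values),
             rewards[s][1]
             + gamma * worst_case_next_value(active_models, s, values))
            for s in range(n)
        ]

    for _ in range(max_iter):
        v_new = [max(q) for q in q_from(v)]
        converged = max(abs(a - b) for a, b in zip(v_new, v)) < tol
        v = v_new
        if converged:
            break
    return q_from(v)


def robust_whittle_index(state: int, rewards, passive_models, active_models,
                         gamma: float = 0.9, lo: float = -10.0,
                         hi: float = 10.0, iters: int = 50) -> float:
    """Bisection for the subsidy at which passive and active tie in `state`.

    Assumes indexability and that the index lies inside [lo, hi].
    """
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        q_passive, q_active = robust_q_values(
            rewards, passive_models, active_models, mid, gamma)[state]
        if q_active > q_passive:
            lo = mid  # active still preferred, so the index is larger
        else:
            hi = mid
    return 0.5 * (lo + hi)


def select_targets(indices: Sequence[float], m: int) -> List[int]:
    """Pick the m arms (targets) with the largest index values."""
    order = sorted(range(len(indices)), key=lambda i: indices[i], reverse=True)
    return order[:m]


if __name__ == "__main__":
    # Hypothetical two-state target: state 0 = poor, state 1 = good.
    rewards = [[0.0, 0.2], [0.0, 1.0]]
    passive = [[[0.7, 0.3], [0.2, 0.8]], [[0.8, 0.2], [0.3, 0.7]]]
    active = [[[0.9, 0.1], [0.6, 0.4]], [[0.95, 0.05], [0.7, 0.3]]]
    per_state = [robust_whittle_index(s, rewards, passive, active)
                 for s in range(2)]
    print("robust index per state:", per_state)
    print("chosen target:", select_targets(per_state, 1))
```

In a multi-target setting of the kind the abstract describes, one would compute an index for each target's current state and pass the list to `select_targets` with `m` equal to the number of targets to visit (1, 2, or 3 in the reported simulations). A finite candidate set is used here only because it keeps the worst-case step a plain `min`; other uncertainty sets would need a different inner optimization.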
This article is indexed in VIP (维普), Wanfang Data (万方数据), and other databases.