
RFID Indoor Positioning Algorithm Based on Asynchronous Advantage Actor-Critic

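Cite this article:	LI Li, ZHENG Jia-li, WANG Zhe, YUAN Yuan, SHI Jing. RFID Indoor Positioning Algorithm Based on Asynchronous Advantage Actor-Critic[J]. Computer Science, 2020, 47(2): 233-238.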

Authors:	LI Li, ZHENG Jia-li, WANG Zhe, YUAN Yuan, SHI Jing

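Affiliations:	School of Computer, Electronics and Information, Guangxi University, Nanning 530004; Guangxi Key Laboratory of Multimedia Communications and Network Technology, Nanning 530004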

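Abstract:	The accuracy of existing RFID indoor positioning algorithms is easily affected by environmental factors. To address this, the paper proposes an RFID indoor positioning algorithm based on asynchronous advantage actor-critic (A3C). The algorithm has two main steps. 1) The RSSI signal-strength values of the RFID tags serve as input; sub-actor networks in multiple threads interact, sample, and learn in parallel, while sub-critic networks evaluate how good each action value is, so that the model is optimized continuously, finds the optimal RSSI values, and trains the positioning model. Each sub-thread network periodically and asynchronously pushes its network parameters to the global network; the global network finally outputs the specific locations of the reference tags, and training yields the A3C positioning model. 2) In the online positioning stage, when a target enters the area under test, its RSSI values are recorded and fed into the A3C positioning model; the sub-thread networks fetch the latest positioning information from the global network, locate the target, and output its specific position. Experimental data show that, compared with traditional RFID indoor positioning algorithms based on support vector machines (SVM), extreme learning machines (ELM), and multi-layer perceptrons (MLP), the proposed algorithm reduces the mean positioning error by 66.114%, 50.316%, and 44.494%, respectively, and improves positioning stability by an average of 59.733%, 53.083%, and 43.748%, respectively. The results indicate that the algorithm performs well when handling a large number of indoor positioning targets.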
Keywords:	RFID; RSSI; reinforcement learning; asynchronous advantage actor-critic; indoor positioning
RFID Indoor Positioning Algorithm Based on Asynchronous Advantage Actor-Critic

LI Li, ZHENG Jia-li, WANG Zhe, YUAN Yuan, SHI Jing. RFID Indoor Positioning Algorithm Based on Asynchronous Advantage Actor-Critic[J]. Computer Science, 2020, 47(2): 233-238.

Authors:	LI Li, ZHENG Jia-li, WANG Zhe, YUAN Yuan, SHI Jing

Affiliation:	School of Computer, Electronics and Information, Guangxi University, Nanning 530004, China; Guangxi Key Laboratory of Multimedia Communications and Network Technology, Nanning 530004, China

Abstract:	The accuracy of existing RFID indoor positioning algorithms is easily affected by environmental factors, and their robustness is weak. To address this, the paper proposes an RFID indoor positioning algorithm based on asynchronous advantage actor-critic (A3C). The algorithm has two main steps. First, the RSSI signal-strength values of the RFID tags are used as input. Sub-actor networks running in multiple threads interact, sample, and learn in parallel, and sub-critic networks evaluate how good each action value is, so that the model is optimized continuously, finds the best RSSI values, and trains the positioning model. Each sub-thread network periodically and asynchronously pushes its network parameters to the global network, and the global network finally outputs the specific locations of the reference tags; this training yields the A3C positioning model. Second, in the online positioning stage, when a target enters the area under test, its RSSI values are recorded and fed into the A3C positioning model. The sub-thread networks obtain the latest positioning information from the global network, locate the target under test, and output its specific position. The proposed algorithm was compared with traditional RFID indoor positioning algorithms based on Support Vector Machines (SVM), Extreme Learning Machine (ELM), and Multi-Layer Perceptron (MLP). Experimental results show that the mean positioning error of the proposed algorithm decreases by 66.114%, 50.316%, and 44.494%, respectively, and that the average positioning stability increases by 59.733%, 53.083%, and 43.748%, respectively. The results indicate that the proposed algorithm achieves better positioning performance when dealing with a large number of indoor positioning targets.

Keywords:	RFID; RSSI; reinforcement learning; asynchronous advantage actor-critic; indoor positioning
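
The abstract describes the training loop only in words: worker threads sample in parallel, a critic scores the actions, and each worker asynchronously pushes updates to a global network. The following is a minimal Python sketch of that pattern, not the authors' implementation. The abstract gives no network architecture, reward definition, or hyperparameters, so everything here is an assumption made for illustration: a linear critic stands in for the neural networks, the actor is omitted, the toy rollouts are invented pairs of normalized RSSI readings, and all names (`n_step_advantages`, `GlobalCritic`, `worker`, `train`, `critic_loss`) are hypothetical.

```python
import threading


def n_step_advantages(rewards, values, bootstrap_value, gamma=0.99):
    """Discounted returns R_t = r_t + gamma * R_{t+1} and advantages A_t = R_t - V(s_t)."""
    returns = []
    running = bootstrap_value
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    advantages = [R - v for R, v in zip(returns, values)]
    return returns, advantages


def value(weights, state):
    """Linear critic V(s) = w . s (an assumed stand-in for a neural critic)."""
    return sum(w * x for w, x in zip(weights, state))


class GlobalCritic:
    """Shared parameters that worker threads pull from and push gradients to."""

    def __init__(self, n_features):
        self.weights = [0.0] * n_features
        self._lock = threading.Lock()

    def pull(self):
        with self._lock:
            return list(self.weights)

    def push(self, grad, lr):
        with self._lock:
            self.weights = [w - lr * g for w, g in zip(self.weights, grad)]


def worker(global_critic, states, rewards, steps, lr, gamma):
    for _ in range(steps):
        local = global_critic.pull()  # sync the local copy with the global one
        values = [value(local, s) for s in states]
        # The segment is treated as terminal, so the bootstrap value is 0.
        _, adv = n_step_advantages(rewards, values, 0.0, gamma)
        # Gradient of 0.5 * mean(A_t^2) with respect to w is -mean(A_t * s_t).
        n = len(states)
        grad = [-sum(a * s[i] for a, s in zip(adv, states)) / n
                for i in range(len(local))]
        global_critic.push(grad, lr)  # asynchronous update of the global copy


def critic_loss(weights, rollouts, gamma):
    """Half the mean squared advantage over all rollouts."""
    sq, n = 0.0, 0
    for states, rewards in rollouts:
        values = [value(weights, s) for s in states]
        _, adv = n_step_advantages(rewards, values, 0.0, gamma)
        sq += sum(a * a for a in adv)
        n += len(adv)
    return 0.5 * sq / n


def train(num_workers=4, steps=200, lr=0.1, gamma=0.9):
    critic = GlobalCritic(n_features=2)
    # Invented toy rollouts: each state is a pair of normalized RSSI readings.
    rollouts = [
        ([[0.2, 0.9], [0.4, 0.7], [0.6, 0.5]], [0.0, 0.0, 1.0]),
        ([[0.9, 0.1], [0.7, 0.3], [0.5, 0.5]], [0.0, 0.5, 1.0]),
    ]
    threads = []
    for k in range(num_workers):
        states, rewards = rollouts[k % len(rollouts)]
        t = threading.Thread(target=worker,
                             args=(critic, states, rewards, steps, lr, gamma))
        threads.append(t)
        t.start()
    for t in threads:
        t.join()
    return critic, rollouts
```

Calling `train()` returns the shared critic and the toy rollouts; comparing `critic_loss(critic.pull(), rollouts, 0.9)` with the loss at zero weights shows whether the asynchronous updates improved the fit. A full A3C model would add a policy (actor) network trained with the same advantages, which this sketch leaves out.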
This article is indexed in databases including VIP and Wanfang Data.
