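A Text Entity Recognition Algorithm Based on Recurrent Neural Network for Customer Service Voice of State Grid
Cite this article: JIA Quanye, ZHANG Qiang, SONG Bochuan. A Text Entity Recognition Algorithm Based on Recurrent Neural Network for Customer Service Voice of State Grid[J]. Distribution & Utilization, 2020(6): 13-20.
Authors: JIA Quanye, ZHANG Qiang, SONG Bochuan
Affiliation: Global Energy Interconnection Research Institute Co., Ltd.
Funding: State Grid Corporation of China headquarters science and technology project "Key Technologies of Intelligent Robot Application Systems for Power Business Halls" (5210EG20000G).
Abstract: Speech-transcribed text in the power domain is of poor quality, which makes named entity recognition for the power grid difficult. Taking speech-transcribed text from power grid information and communications technology (ICT) systems as the research object, this paper builds an entity recognition algorithm oriented to the features of power-domain text, based on a bi-directional long short-term memory (BiLSTM) neural network fused with a conditional random field (CRF). Comparison with recurrent neural network (RNN) and other neural network algorithms shows that BiLSTM-CRF reaches an entity recognition accuracy of 79% and an F1 value of 80% in the power grid ICT domain, outperforming long short-term memory (LSTM) and other RNN algorithms, and that it recognizes mis-transcribed entities reasonably well. The algorithm effectively improves entity recognition accuracy on domain speech-transcribed text while lowering the cost of domain speech recognition technology, and it provides high-quality unstructured sample data for natural language processing applications in power grid customer service such as information retrieval, intelligent question answering, and personalized recommendation.

Keywords: BiLSTM; recurrent neural network; conditional random field; ICT customer service; entity recognition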

A Text Entity Recognition Algorithm Based on Recurrent Neural Network for Customer Service Voice of State Grid
JIA Quanye, ZHANG Qiang, SONG Bochuan. A Text Entity Recognition Algorithm Based on Recurrent Neural Network for Customer Service Voice of State Grid[J]. Distribution & Utilization, 2020(6): 13-20.
Authors: JIA Quanye, ZHANG Qiang, SONG Bochuan
Affiliation: Global Energy Interconnection Research Institute Co., Ltd., Beijing 102209, China
Abstract: Speech-transcribed text in the power grid domain is of poor quality, which hinders named entity recognition. This paper takes information and communications technology (ICT) speech-transcribed text from the power grid as its research object and constructs a domain entity recognition algorithm that combines a bi-directional long short-term memory (BiLSTM) network with a conditional random field (CRF). The algorithm is verified by comparison with recurrent neural network (RNN) and other neural network algorithms. Results: In the ICT domain, the entity recognition accuracy is 80% and the F1 value is 79.56%, better than long short-term memory (LSTM) and other RNN algorithms. Conclusion: The algorithm effectively improves entity recognition accuracy on domain speech-transcribed text, reduces the cost of domain speech recognition technology, and provides a high-quality unstructured sample base for natural language processing tasks in ICT customer service such as information retrieval, intelligent Q&A, and personalized recommendation.
Keywords: BiLSTM; RNN; CRF; ICT customer service; entity recognition
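
The abstract does not describe how the CRF is combined with the BiLSTM. In a common BiLSTM-CRF arrangement, the BiLSTM produces a score for each tag at each token (emission scores), and the CRF adds tag-to-tag transition scores and selects the best whole sequence with Viterbi decoding. The sketch below illustrates only that decoding step, in plain Python. It is not the authors' implementation: the tag names (`O`, `B-ENT`, `I-ENT`), the function `viterbi_decode`, and all scores are hypothetical values chosen for illustration.

```python
from typing import Dict, List


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[str, Dict[str, float]],
    tags: List[str],
) -> List[str]:
    """Return the highest-scoring tag sequence for a linear-chain CRF.

    emissions[t][tag] is the per-token score (assumed to come from a BiLSTM);
    transitions[prev][cur] is the score of moving from tag prev to tag cur.
    """
    if not emissions:
        return []
    # best[t][tag]: best score of any path that ends in `tag` at position t
    best = [{tag: emissions[0][tag] for tag in tags}]
    # back[t - 1][tag]: previous tag on that best path
    back = []
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for cur in tags:
            prev = max(tags, key=lambda p: best[-1][p] + transitions[p][cur])
            scores[cur] = best[-1][prev] + transitions[prev][cur] + emissions[t][cur]
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final tag to recover the path
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical BIO tag set and scores, for illustration only
tags = ["O", "B-ENT", "I-ENT"]
transitions = {p: {c: 0.0 for c in tags} for p in tags}
transitions["O"]["I-ENT"] = -100.0  # an entity cannot continue right after "O"

emissions = [
    {"O": 2.0, "B-ENT": 0.5, "I-ENT": 0.0},
    {"O": 0.0, "B-ENT": 1.0, "I-ENT": 1.2},
    {"O": 0.1, "B-ENT": 0.0, "I-ENT": 1.5},
]

# Picking the top-scoring tag per token independently ignores transitions
greedy = [max(tags, key=lambda tag: e[tag]) for e in emissions]
decoded = viterbi_decode(emissions, transitions, tags)
print(greedy)   # ['O', 'I-ENT', 'I-ENT']
print(decoded)  # ['O', 'B-ENT', 'I-ENT']
```

In this toy case, choosing tags token by token yields `I-ENT` directly after `O`, which is not a valid BIO sequence. Decoding over the whole sequence with the transition penalty yields `B-ENT` followed by `I-ENT`. This kind of sequence-level constraint is the usual motivation for placing a CRF on top of a BiLSTM.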
This article is indexed in CNKI, VIP, and other databases.