
Intelligent text classification method for water diversion project inspection based on character-level CNN
Cite this article: LIU Ting, ZHANG Sherong, LI Zhihong, GUAN Wei. Intelligent text classification method for water diversion project inspection based on character-level CNN [J]. Journal of Hydroelectric Engineering (水力发电学报), 2021, 40(6): 89-98.
Authors: LIU Ting (刘婷)  ZHANG Sherong (张社荣)  LI Zhihong (李志竑)  GUAN Wei (关炜)
Abstract: Routine safety inspection is an important means of maintaining the safe operation of long-distance water diversion projects. At present, the unstructured text collected during inspections is graded for safety level mainly by hand, which is clearly deficient in both efficiency and accuracy. Drawing on natural language processing, this study proposes an intelligent method for classifying inspection safety texts with a character-level convolutional neural network (CNN). The method modifies the input layer of the CNN by introducing pre-trained vectors for individual characters, so that the classification model extracts features directly from the raw text. This removes the dependence of traditional classification methods on a domain-specific lexicon and makes the model less sensitive to colloquial expressions and typographical errors in the text. Inspection texts from the South-to-North Water Diversion Project in China serve as the case study, and the effectiveness and superiority of the proposed method are verified through a comprehensive comparison with several deep learning algorithms. The results show that character-level classification clearly outperforms traditional word-based classification, and that the CNN clearly outperforms the other deep learning networks on inspection text classification. The method achieves high classification accuracy and provides a new intelligent tool for the safety maintenance of water diversion projects.

Keywords: water diversion project  text classification  character vectorization  convolutional neural network (CNN)  natural language processing
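
The character-level input described in the abstract can be illustrated with a small forward-pass sketch in plain Python. It is not the authors' architecture: the embedding size, kernel widths, number of filters, and the three output classes are assumptions chosen for illustration, the randomly initialised vectors stand in for the pre-trained character vectors, and the two sample sentences are invented. The point is only that each character is looked up directly, so no word segmentation or domain lexicon is needed, and an unseen or mistyped character degrades to a zero vector instead of breaking a whole word.

```python
import math
import random


def build_vocab(texts):
    """Map each distinct character to an integer id; id 0 is padding/unknown."""
    chars = sorted({ch for text in texts for ch in text})
    return {ch: i + 1 for i, ch in enumerate(chars)}


def encode(text, vocab, max_len):
    """Convert a string to a fixed-length list of character ids."""
    ids = [vocab.get(ch, 0) for ch in text[:max_len]]
    return ids + [0] * (max_len - len(ids))


def conv1d_max(embedded, kernel, bias):
    """Apply one filter along the sequence, then ReLU and max-over-time pooling."""
    width = len(kernel)
    best = 0.0  # starting at 0 folds the ReLU into the max
    for start in range(len(embedded) - width + 1):
        total = bias
        for offset in range(width):
            vec = embedded[start + offset]
            total += sum(w * x for w, x in zip(kernel[offset], vec))
        best = max(best, total)
    return best


class CharCNN:
    """Untrained character-level CNN; all sizes are illustrative assumptions."""

    def __init__(self, vocab_size, embed_dim=8, kernel_widths=(2, 3, 4),
                 filters_per_width=4, num_classes=3, seed=0):
        rng = random.Random(seed)

        def rand(n):
            return [rng.uniform(-0.5, 0.5) for _ in range(n)]

        # Random stand-in for pre-trained character vectors; row 0 is all zeros.
        self.embedding = [[0.0] * embed_dim] + [rand(embed_dim) for _ in range(vocab_size)]
        # Each filter is (kernel, bias); a kernel holds one weight vector per position.
        self.filters = [([rand(embed_dim) for _ in range(width)], 0.0)
                        for width in kernel_widths
                        for _ in range(filters_per_width)]
        self.out_w = [rand(len(self.filters)) for _ in range(num_classes)]
        self.out_b = [0.0] * num_classes

    def features(self, ids):
        """One pooled value per filter."""
        embedded = [self.embedding[i] for i in ids]
        return [conv1d_max(embedded, kernel, bias) for kernel, bias in self.filters]

    def predict_proba(self, ids):
        """Linear layer followed by a numerically stable softmax."""
        feats = self.features(ids)
        logits = [sum(w * f for w, f in zip(row, feats)) + b
                  for row, b in zip(self.out_w, self.out_b)]
        top = max(logits)
        exps = [math.exp(z - top) for z in logits]
        norm = sum(exps)
        return [e / norm for e in exps]


# Invented sample inspection notes, used only to build a toy vocabulary.
samples = ["渠道边坡出现裂缝", "闸门运行正常"]
vocab = build_vocab(samples)
model = CharCNN(vocab_size=len(vocab))
probs = model.predict_proba(encode(samples[0], vocab, max_len=16))
print(probs)
```

Because the weights are random, the printed probabilities carry no meaning. In practice the filters and output layer would be trained on labelled inspection texts, typically with a deep learning framework instead of hand-written loops.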
This article is indexed in CNKI and other databases.