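Topic Word Extraction Based on BiLSTM-CRF for Sport News
Cite this article: JIANG Yiqi, ZHAO Tongzhou, CHAI Yue, GAO Peidong. Topic Word Extraction Based on BiLSTM-CRF for Sport News[J]. Journal of Wuhan Institute of Technology, 2020, 42(1): 102-107. DOI: 10.19843/j.cnki.CN42-1779/TQ.201908018
Authors: JIANG Yiqi  ZHAO Tongzhou  CHAI Yue  GAO Peidong
Affiliation: School of Computer Science and Engineering, Wuhan Institute of Technology, Wuhan 430205, Hubei, China
Abstract: Typical recurrent neural network methods recognize topic words with low accuracy because they lack context-dependent, sentence-level information. To address this, we propose a topic word extraction method that combines a bidirectional long short-term memory network with conditional random field (BiLSTM-CRF) model and TextRank. First, TextRank extracts topic sentences from the news text. Next, a bidirectional long short-term memory (BiLSTM) model captures the forward and backward features of the text. Finally, a conditional random field (CRF) performs sentence-level sequence labeling to obtain the topic words. In experiments on multiple sports news datasets, the method improves the F1 score by about 0.8% to 5.1% over the BiLSTM baseline and takes less time. The improved BiLSTM-CRF method can therefore significantly improve both the accuracy and the efficiency of topic word extraction.

Keywords: sport news  topic word extraction  TextRank  BiLSTM-CRF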
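
The first stage named in the abstract, topic sentence extraction with TextRank, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the whitespace tokenizer, the damping factor of 0.85, and the iteration count are all assumptions, and Chinese news text would need a word segmenter in place of `str.split`.

```python
import math


def similarity(a, b):
    """Word-overlap similarity between two token lists, normalized by log length."""
    if len(a) < 2 or len(b) < 2:
        return 0.0  # avoids log(1) = 0 in the denominator
    overlap = len(set(a) & set(b))
    return overlap / (math.log(len(a)) + math.log(len(b)))


def textrank_sentences(sentences, tokenize=str.split, top_k=1,
                       damping=0.85, iters=50):
    """Rank sentences on a similarity graph; return the top_k in document order."""
    tokens = [tokenize(s) for s in sentences]
    n = len(sentences)
    # Symmetric edge weights between every pair of distinct sentences.
    weights = [[similarity(tokens[i], tokens[j]) if i != j else 0.0
                for j in range(n)] for i in range(n)]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iters):
        new_scores = []
        for i in range(n):
            # Each neighbor j passes on a share of its score proportional
            # to the weight of the edge j -> i.
            rank = sum(weights[j][i] / out_sum[j] * scores[j]
                       for j in range(n) if out_sum[j] > 0)
            new_scores.append((1 - damping) + damping * rank)
        scores = new_scores
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)[:top_k]
    return [sentences[i] for i in sorted(ranked)]


# Hypothetical example sentences, not taken from the paper's datasets.
sentences = [
    "the team won the final match",
    "the team won the match in extra time",
    "ticket prices rose this year",
]
topic_sentences = textrank_sentences(sentences, top_k=2)
```

In this toy input the third sentence shares no words with the others, so it receives only the baseline score and is not selected.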

Topic Word Extraction Based on BiLSTM-CRF for Sport News
JIANG Yiqi, ZHAO Tongzhou, CHAI Yue, GAO Peidong. Topic Word Extraction Based on BiLSTM-CRF for Sport News[J]. Journal of Wuhan Institute of Technology, 2020, 42(1): 102-107. DOI: 10.19843/j.cnki.CN42-1779/TQ.201908018
Authors: JIANG Yiqi  ZHAO Tongzhou  CHAI Yue  GAO Peidong
Affiliation: School of Computer Science and Engineering, Wuhan Institute of Technology, Wuhan 430205, China
Abstract: Typical recurrent neural networks extract topic words with low recognition accuracy because they lack sentence-level context information. To address this, we propose a method for extracting topic words based on a Bidirectional Long Short-Term Memory (BiLSTM) network with a Conditional Random Field (CRF). First, topic sentences are extracted from news texts by the TextRank model. Then, the forward and backward features of the texts are obtained by the BiLSTM network. Finally, the topic words are tagged at the sentence level by a CRF layer that performs sequence labeling. Experiments were performed on multiple sports news datasets. Compared with the BiLSTM baseline, the F1 score increases by about 0.8% to 5.1%, and the method takes less time. The experimental results show that our method can significantly improve the accuracy and efficiency of topic word extraction.
Keywords: sport news  topic word extraction  TextRank  BiLSTM-CRF
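
The final stage in the abstract, sentence-level sequence labeling with a CRF layer, is commonly decoded with the Viterbi algorithm over per-token emission scores (here standing in for BiLSTM outputs) and tag-to-tag transition scores. The sketch below is illustrative only: the B/I/O tag set, the score values, and the function name `viterbi_decode` are assumptions, not details from the paper.

```python
def viterbi_decode(emissions, transitions):
    """Return the highest-scoring tag index sequence.

    emissions[t][j]: score of tag j at token t.
    transitions[i][j]: score of moving from tag i to tag j.
    """
    n_tags = len(transitions)
    best = list(emissions[0])  # best path score ending in each tag so far
    backpointers = []
    for scores in emissions[1:]:
        step_best, step_ptr = [], []
        for j in range(n_tags):
            # Best previous tag for reaching tag j at this token.
            prev = max(range(n_tags), key=lambda i: best[i] + transitions[i][j])
            step_ptr.append(prev)
            step_best.append(best[prev] + transitions[prev][j] + scores[j])
        best = step_best
        backpointers.append(step_ptr)
    # Trace the best path backwards from the best final tag.
    path = [max(range(n_tags), key=lambda i: best[i])]
    for ptr in reversed(backpointers):
        path.append(ptr[path[-1]])
    path.reverse()
    return path


TAGS = ["B", "I", "O"]  # assumed tag set: begin, inside, outside a topic word
# All transitions are neutral except O -> I, which is heavily penalized.
transitions = [[0.0, 0.0, 0.0],
               [0.0, 0.0, 0.0],
               [0.0, -1e4, 0.0]]
# Hypothetical emission scores for a three-token sentence.
emissions = [[1.0, 0.0, 1.2],
             [0.0, 2.0, 0.5],
             [0.0, 0.0, 1.0]]
decoded = [TAGS[i] for i in viterbi_decode(emissions, transitions)]
greedy = [TAGS[max(range(3), key=lambda j: row[j])] for row in emissions]
```

Picking the best tag per token independently yields the invalid sequence O, I, O, while decoding with the transition scores yields B, I, O. This is the kind of sequence-level constraint that a CRF layer adds on top of per-token scores.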
This article is indexed in CNKI and other databases.