
Chinese Multi-Document Automatic Summarization Model Based on Siamese LSTM
Cite this article: Gong Yonggang, Wang Jiaxin, Lian Xiaoqin, Pei Chenchen. Chinese multi-document automatic summarization model based on Siamese LSTM[J]. Computer Applications and Software, 2021, 38(3): 287-290, 326.
Authors: Gong Yonggang, Wang Jiaxin, Lian Xiaoqin, Pei Chenchen
Affiliation: Computer and Information Engineering College, Beijing Technology and Business University, Beijing 100048, China (all four authors)
Abstract: As the volume of text information grows rapidly, a deep-learning-based multi-document automatic text summarization model is proposed to improve reading efficiency. Building on a traditional summarization model, a Siamese LSTM network is applied to text similarity calculation, with the Manhattan distance used to represent the similarity between texts. The network model is further modified by removing stop words to improve computational efficiency. Experimental results show that, compared with traditional methods such as cosine similarity, the summaries generated with Siamese LSTM are semantically closer to the topic and of higher quality, and the efficiency of the whole summarization system also improves significantly.

Keywords: Chinese automatic summarization; Siamese LSTM; natural language processing; deep learning
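
The similarity step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the short vectors stand in for the sentence encodings that the two weight-sharing LSTM branches would produce, the `exp(-distance)` mapping is the form commonly used in Manhattan-distance Siamese LSTM models and is assumed here, and `STOP_WORDS`, `remove_stop_words`, `manhattan_similarity`, and `cosine_similarity` are hypothetical names chosen for illustration. The cosine function is included only as the traditional baseline the abstract compares against.

```python
import math

# Hypothetical stop-word list for illustration only; the list used in the
# paper is not given on this page.
STOP_WORDS = {"的", "了", "在", "是", "和"}


def remove_stop_words(tokens):
    """Drop stop words from an already-segmented sentence (list of tokens)."""
    return [t for t in tokens if t not in STOP_WORDS]


def manhattan_similarity(h1, h2):
    """Map the Manhattan (L1) distance between two vectors into (0, 1].

    Identical vectors give 1.0; the score decays towards 0 as the distance
    grows. The exp(-distance) form is an assumption, not taken from the page.
    """
    if len(h1) != len(h2):
        raise ValueError("vectors must have the same length")
    distance = sum(abs(a - b) for a, b in zip(h1, h2))
    return math.exp(-distance)


def cosine_similarity(h1, h2):
    """Traditional baseline: cosine of the angle between two vectors."""
    if len(h1) != len(h2):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(h1, h2))
    norm = math.sqrt(sum(a * a for a in h1)) * math.sqrt(sum(b * b for b in h2))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    # Stop words are removed before the sentence would be fed to the encoder.
    print(remove_stop_words(["文本", "的", "摘要", "是", "任务"]))

    # Placeholder vectors standing in for the two LSTM branches' outputs.
    h_a = [0.2, -0.1, 0.4]
    h_b = [0.1, -0.1, 0.5]
    print(manhattan_similarity(h_a, h_b))
    print(cosine_similarity(h_a, h_b))
```

Removing stop words shortens each input sequence, which is one plausible reason the abstract reports better computational efficiency: a recurrent encoder processes fewer time steps per sentence.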
This article is indexed in VIP, Wanfang Data, and other databases.