
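An Automatic Summarization Method Based on Article Topic and Content
Cite this article: Chen Yanmin, Wang Xiaolong, Liu Yuanchao, Lou Xizhong. An Automatic Summarization Method Based on Article Topic and Content[J]. Computer Engineering and Applications, 2004, 40(33): 11-14
Authors: Chen Yanmin, Wang Xiaolong, Liu Yuanchao, Lou Xizhong
Affiliation (all four authors): Department of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
Funding: National Natural Science Foundation of China (No. 60373100); National High Technology Research and Development Program of China (863 Program) (No. 2002AA117010-09)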
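Abstract: This paper presents a new automatic summarization system that uses natural language processing techniques. By fusing a content-based method with a topic-based method, it combines topic and content to generate summaries with good coherence and fluency. The method first analyzes the topic words, dynamically handling documents with abstract titles and documents with concrete titles; it then applies lexical, syntactic, and semantic analysis, among other natural language processing techniques, to analyze the text content in depth; next, it fuses the results of the two analyses by linear weighting to generate the summary; finally, it applies anaphora resolution to make the generated summary more coherent and fluent. Evaluation results show that, compared with an automatic summarization system based on content alone, the summaries generated by this system are of clearly higher quality.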

Keywords: automatic summarization  natural language processing  topic analysis  content analysis  fusion
Article ID: 1002-8331-(2004)33-0011-04

Automatic Text Summarization Based on Topic and Content
Chen Yanmin, Wang Xiaolong, Liu Yuanchao, Lou Xizhong. Automatic Text Summarization Based on Topic and Content[J]. Computer Engineering and Applications, 2004, 40(33): 11-14
Authors: Chen Yanmin, Wang Xiaolong, Liu Yuanchao, Lou Xizhong
Abstract: A new automatic summarization system using Natural Language Processing techniques is proposed. It processes documents based not only on the content of the original text, by analyzing its structure, but also on the topic of the summary, which is determined by the user or by the text title. The method first analyzes topic words and handles documents with abstract titles and documents with concrete titles separately; a content-based method that integrates several NLP technologies is then applied; the results of the two methods are fused to generate the summary; finally, anaphora resolution is applied to improve the fluency of the summary. Evaluation results show that quality summaries are produced from arbitrary Chinese text. Compared with a system based on content alone, the proposed system produces comparable or better summaries overall.
Keywords: automatic text summarization  Natural Language Processing (NLP)  topic analysis  content analysis  fusion
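
The abstract says that the topic-based and content-based analyses are fused by linear weighting, but it does not give the scoring functions or the weights. The following is only a minimal sketch of what such a fusion step could look like, not the authors' implementation. Everything in it is an assumption made for illustration: the function names, the topic score (share of topic words found in a sentence), the content score (average term frequency, scaled to [0, 1]), and the weight `alpha`. Sentences are assumed to be already tokenized, and the anaphora resolution step is not modeled.

```python
from collections import Counter


def topic_score(tokens, topic_words):
    """Share of the topic words that occur in the sentence (assumed measure)."""
    topics = set(topic_words)
    if not topics:
        return 0.0
    return len(set(tokens) & topics) / len(topics)


def content_scores(sentences):
    """Average document-level term frequency per sentence, scaled to [0, 1].

    This is a stand-in for the paper's lexical/syntactic/semantic analysis,
    which the abstract does not specify.
    """
    tf = Counter(tok for sent in sentences for tok in sent)
    raw = [sum(tf[t] for t in sent) / len(sent) if sent else 0.0
           for sent in sentences]
    top = max(raw, default=0.0)
    return [r / top if top else 0.0 for r in raw]


def fuse(topic, content, alpha=0.5):
    """Linear weighted fusion: alpha * topic + (1 - alpha) * content."""
    return [alpha * t + (1 - alpha) * c for t, c in zip(topic, content)]


def summarize(sentences, topic_words, alpha=0.5, k=2):
    """Return the indices of the k best sentences, in document order."""
    topic = [topic_score(s, topic_words) for s in sentences]
    content = content_scores(sentences)
    fused = fuse(topic, content, alpha)
    best = sorted(range(len(sentences)),
                  key=lambda i: fused[i], reverse=True)[:k]
    return sorted(best)


if __name__ == "__main__":
    # Hypothetical pre-tokenized document and topic words.
    doc = [
        ["summary", "system", "topic"],
        ["weather", "nice"],
        ["topic", "content", "fusion", "summary"],
    ]
    print(summarize(doc, ["topic", "summary"], alpha=0.5, k=2))
```

In this sketch, a larger `alpha` favors sentences that match the title or user-supplied topic, while a smaller one favors sentences that are central to the text itself. Returning the selected indices in document order keeps the extracted sentences in their original sequence.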
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.