
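A Chinese Automatic Summarization System Based on Rules and Statistics
Cite this article: FU Jian-lian, CHEN Qun-xiu. A Chinese Automatic Summarization System Based on Rules and Statistics[J]. Journal of Chinese Information Processing, 2006, 20(5): 12-18
Authors: FU Jian-lian  CHEN Qun-xiu
Affiliation: State Key Lab of Intelligent Technology and System, Department of Computer Science and Technology, Tsinghua University
Abstract: Automatic summarization is an important topic in natural language processing. Building on traditional methods, this paper proposes an approach to automatic summarization of Chinese text. For discourse structure analysis, we propose a topic segmentation algorithm based on the similarity of consecutive paragraphs, which makes the generated summary more comprehensive in content and more balanced in structure. In addition, a set of rules is applied to post-process the draft summary, making the final summary more readable. Finally, we propose a new summary evaluation method (F-new-measure) and use it to test the system. The tests show that the evaluation score remains fairly stable across different summary compression ratios.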

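Keywords: computer application  Chinese information processing  automatic summarization  vector space model  topic segmentation  readability  evaluation
Article ID: 1003-0077(2006)05-0010-07
Received: 2005-09-09
Revised: 2006-07-10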

Research on Automatic Summarization Based on Rules and Statistics for Chinese Texts
FU Jian-lian, CHEN Qun-xiu. Research on Automatic Summarization Based on Rules and Statistics for Chinese Texts[J]. Journal of Chinese Information Processing, 2006, 20(5): 12-18
Authors: FU Jian-lian  CHEN Qun-xiu
Affiliation: State Key Lab of Intelligent Technology and System, Department of Computer Science and Technology, Tsinghua University
Abstract: Automatic summarization is an important research topic in natural language processing. This paper presents an approach to Chinese text summarization that builds on traditional methods. For text structure analysis, an algorithm is proposed for multi-topic text partitioning based on sequential paragraph similarity, which gives the abstract of a multi-topic article more comprehensive content and a more balanced structure. Furthermore, a series of rules is applied to enhance the readability of the output abstract. Finally, a new evaluation method is put forward. Preliminary tests show that its value is stable.
Keywords: computer application  Chinese information processing  automatic summarization  vector space model  topic segmentation  readability  evaluation
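
The abstract names topic segmentation based on the similarity of consecutive paragraphs under a vector space model, but gives no details of the algorithm. The sketch below is only one plausible reading of that idea, not the authors' method: it assumes paragraphs are already word-segmented, represents each one as a raw term-frequency vector, and places a topic boundary wherever the cosine similarity of two adjacent paragraphs drops below a cut-off. The function names `cosine` and `segment_topics`, the `threshold` value, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def segment_topics(paragraphs, threshold=0.1):
    """Group consecutive paragraphs into topic blocks.

    paragraphs: list of token lists (assumed already word-segmented).
    threshold: hypothetical cut-off; a boundary is placed wherever the
    similarity of two adjacent paragraphs falls below it.
    Returns a list of blocks, each a list of paragraph indices.
    """
    if not paragraphs:
        return []
    vectors = [Counter(tokens) for tokens in paragraphs]
    blocks = [[0]]
    for i in range(1, len(vectors)):
        if cosine(vectors[i - 1], vectors[i]) < threshold:
            blocks.append([i])       # similarity dropped: start a new topic
        else:
            blocks[-1].append(i)     # still the same topic
    return blocks


# Toy data (invented for illustration): two paragraphs about summarization,
# followed by two about the stock market.
paragraphs = [
    ["summary", "text", "sentence"],
    ["summary", "sentence", "weight"],
    ["market", "price", "stock"],
    ["stock", "price", "index"],
]
print(segment_topics(paragraphs))  # [[0, 1], [2, 3]]
```

Once the text is split into topic blocks, a summarizer could draw sentences from every block rather than from the highest-scoring region alone, which is one way to pursue the balanced coverage the abstract describes.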
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.