
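A Chinese Automatic Summarization System Based on Rules and Statistics

Cite this article:	FU Jian-lian, CHEN Qun-xiu. A Chinese Automatic Summarization System Based on Rules and Statistics [J]. Journal of Chinese Information Processing, 2006, 20(5): 12-18

Authors:	FU Jian-lian, CHEN Qun-xiu

Affiliation:	State Key Laboratory of Intelligent Technology and Systems, Department of Computer Science, Tsinghua University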

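Abstract:	Automatic summarization is an important topic in natural language processing. Building on traditional methods, this paper proposes a method for automatic summarization of Chinese text. For discourse structure analysis, we propose a topic segmentation algorithm based on the similarity of consecutive paragraphs, which makes the generated summary more comprehensive in content and more balanced in structure. In addition, a set of rules is applied to post-process the draft summary, making the final summary more readable. Finally, we propose a new summary evaluation measure (F-new-measure) and use it to test the system. The tests show that the evaluation score remains fairly stable across different summary compression ratios.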
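Keywords:	computer application; Chinese information processing; automatic summarization; vector space model; topic segmentation; readability; evaluation
Article ID:	1003-0077(2006)05-0010-07
Received:	2005-09-09
Revised:	2006-07-10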
Research on Automatic Summarization Based on Rules and Statistics for Chinese Texts

FU Jian-lian, CHEN Qun-xiu. Research on Automatic Summarization Based on Rules and Statistics for Chinese Texts [J]. Journal of Chinese Information Processing, 2006, 20(5): 12-18

Authors:	FU Jian-lian, CHEN Qun-xiu

Affiliation:	State Key Lab of Intelligent Technology and System, Department of Computer Science and Technology, Tsinghua University

Abstract:	Automatic summarization is an important research topic in natural language processing. This paper presents an approach to Chinese text summarization that builds on traditional methods. For text structure analysis, an algorithm is proposed for partitioning multi-topic texts based on the similarity of sequential paragraphs, which gives the abstract of a multi-topic article more comprehensive content and a more balanced structure. Furthermore, a series of rules is applied to enhance the readability of the output abstract. Finally, a new evaluation method is put forward. Preliminary tests show that its value is stable.

Keywords:	computer application; Chinese information processing; automatic summarization; vector space model; topic segmentation; readability; evaluation
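
The abstract mentions partitioning a multi-topic text by the similarity of sequential paragraphs under a vector space model, but it does not give the algorithm itself. The following is only a minimal sketch of that general idea, not the authors' method: it assumes paragraphs are already word-segmented, uses raw term frequencies as the vectors, and starts a new topic block when the cosine similarity between adjacent paragraphs drops below a cutoff. The names `tf_vector`, `cosine`, `segment_topics`, and the `threshold` value are all hypothetical and chosen for illustration.

```python
import math
from collections import Counter


def tf_vector(tokens):
    """Term-frequency vector for one paragraph (a simple bag of words)."""
    return Counter(tokens)


def cosine(u, v):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def segment_topics(paragraphs, threshold=0.2):
    """Group consecutive paragraphs into topic blocks.

    `paragraphs` is a list of token lists (assumed already word-segmented).
    A new block starts whenever the similarity between a paragraph and its
    predecessor falls below `threshold` (an assumed, illustrative cutoff).
    Returns a list of blocks, each a list of paragraph indices.
    """
    if not paragraphs:
        return []
    vectors = [tf_vector(p) for p in paragraphs]
    blocks = [[0]]
    for i in range(1, len(vectors)):
        if cosine(vectors[i - 1], vectors[i]) < threshold:
            blocks.append([i])       # similarity dropped: open a new topic block
        else:
            blocks[-1].append(i)     # still similar: extend the current block
    return blocks


# Toy input: two paragraphs about summarization, two about the stock market.
paragraphs = [
    ["自动", "文摘", "方法"],
    ["文摘", "方法", "评价"],
    ["股票", "市场", "上涨"],
    ["市场", "上涨", "股票", "投资"],
]
blocks = segment_topics(paragraphs)
print(blocks)
```

On this toy input the first two paragraphs share vocabulary and the last two share vocabulary, so the sketch yields two blocks. A summarizer could then draw sentences from each block in turn, which is one plausible way to obtain the balanced coverage the abstract describes; how the paper actually allocates sentences across topics is not stated here.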
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.