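Research on Sentence Optimum Selection Algorithm for Multi-Document Summarization (多文档文摘句子优选算法研究)
Cite this article: Zhang Shu, Zhao Tie-jun, Yao Chao, Zheng De-quan. Research on Sentence Optimum Selection Algorithm for Multi-Document Summarization [J]. Journal of Electronics & Information Technology, 2008, 30(12): 2921-2925.
Authors: Zhang Shu, Zhao Tie-jun, Yao Chao, Zheng De-quan
Affiliation: School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
Funding: Supported by the National Natural Science Foundation of China and the National "863" Program (2006AA01Z150)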
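Abstract: This paper analyzes the problem of selecting summary sentences and proposes a sentence optimum selection method. Whereas traditional methods build a summary by adding sentences one at a time, the proposed method builds it by deleting sentences one at a time within a bounded range. Sentence selection proceeds in two stages. The first stage obtains the candidate summary sentence set, using a direct acquisition algorithm and an acquisition algorithm based on redundancy information processing. The second stage deletes sentences step by step, using different features in turn to measure each sentence's contribution to the candidate summary sentence set; this is the proposed sentence optimum selection algorithm. With DUC 2004 as the experimental corpus, the ROUGE scores of the summaries generated after sentence selection confirm that sentence selection is necessary in summary generation, and a comparison with the sentence selection method based on redundancy information processing confirms the effectiveness of the proposed algorithm.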

Keywords: Sentence optimum selection; Multi-document summarization; Redundancy information processing
Received: 2007-06-04
Revised: 2007-11-12

Research on Sentence Optimum Selection Algorithm for Multi-Document Summarization
Zhang Shu, Zhao Tie-jun, Yao Chao, Zheng De-quan. Research on Sentence Optimum Selection Algorithm for Multi-Document Summarization [J]. Journal of Electronics & Information Technology, 2008, 30(12): 2921-2925.
Authors:Zhang Shu  Zhao Tie-jun  Yao Chao  Zheng De-quan
Affiliation:Department of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
Abstract:This paper analyzes sentence selection in summarization and proposes an approach that obtains a summary by deleting sentences from a sentence set, in contrast to the traditional method of adding sentences one at a time. The approach has two stages: the first obtains the candidate summary sentence set, using a direct acquisition algorithm and a redundancy-based acquisition algorithm; the second deletes sentences using the sentence optimum selection algorithm. With DUC 2004 as the test corpus, the ROUGE scores of summaries produced with sentence selection demonstrate the necessity of sentence optimum selection for multi-document summarization, and a comparison with the redundancy-based sentence selection method demonstrates the effectiveness of the proposed approach.
Keywords:Sentence optimum selection  Multi-document summarization  Redundancy information processing
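
The two-stage, deletion-based procedure described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the features or scoring functions, so the tokenizer, the term-frequency weights, the top-k "direct acquisition" step, the contribution measure (weight of terms that only one sentence covers), and the function names `select_candidates`, `contribution`, `prune`, and `summarize` are all assumptions made for illustration.

```python
from collections import Counter

def tokens(sentence):
    """Lowercased word tokens with surrounding punctuation stripped."""
    words = (w.strip(".,;:!?\"'()").lower() for w in sentence.split())
    return [w for w in words if w]

def select_candidates(sentences, weights, k):
    """Stage 1 (assumed form of direct acquisition): keep the k top-scoring sentences."""
    def score(s):
        return sum(weights[t] for t in set(tokens(s)))
    return sorted(sentences, key=score, reverse=True)[:k]

def contribution(i, candidates, weights):
    """Assumed contribution measure: weight of terms only candidates[i] covers."""
    others = set()
    for j, s in enumerate(candidates):
        if j != i:
            others.update(tokens(s))
    return sum(weights[t] for t in set(tokens(candidates[i])) - others)

def prune(candidates, weights, max_words):
    """Stage 2: delete the least-contributing sentence until the word budget is met."""
    kept = list(candidates)
    while len(kept) > 1 and sum(len(tokens(s)) for s in kept) > max_words:
        worst = min(range(len(kept)),
                    key=lambda i: contribution(i, kept, weights))
        del kept[worst]
    return kept

def summarize(sentences, k, max_words):
    # Term frequency over all input sentences stands in for the paper's features.
    weights = Counter(t for s in sentences for t in tokens(s))
    return prune(select_candidates(sentences, weights, k), weights, max_words)

if __name__ == "__main__":
    docs = [
        "The flood damaged the bridge.",
        "The flood damaged the bridge badly.",
        "Rescue teams arrived on Monday.",
    ]
    for sentence in summarize(docs, k=3, max_words=11):
        print(sentence)
```

In this toy input, the first sentence adds no term that the second does not already cover, so it is the one deleted when the candidate set exceeds the word budget. The kept sentences are returned in score order rather than document order.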
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.