
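Multi-document Summarization Method Based on Topic Concept Extraction (基于主题概念抽取的多文档文摘方法)
Cite this article: SONG Xuan-chen, LIU Gui-quan. Multi-document Summarization Method Based on Topic Concept Extraction [J]. Computer Engineering, 2010, 36(4): 190-192
Authors: SONG Xuan-chen  LIU Gui-quan
Affiliation: Department of Computer Science and Technology, University of Science and Technology of China, Hefei 230027
Funding: Supported by the National Natural Science Foundation of China
Abstract: An effective concept extraction method for multi-document summarization is proposed. The synonymy and hypernymy/hyponymy relations between words in WordNet are used for semantic disambiguation and concept tree construction. Topic concepts are then extracted by a concept optimization algorithm, a concept vector space model is built, and the summary sentences are obtained by the maximal marginal relevance method. Replacing traditional word-form statistics with semantic concept statistics allows the important information in the documents to be extracted more accurately. Evaluation results on DUC2005 show that the method performs better than traditional methods.

Keywords: multi-document summarization  concept tree  concept extraction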
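
The difference between word-form counting and concept counting mentioned in the abstract can be illustrated with a minimal Python sketch. The lexicon `TOY_CONCEPTS` and both function names are hypothetical stand-ins, not the authors' implementation: a real system would derive concepts from WordNet synonym and hypernym relations and apply the disambiguation and concept optimization steps, whose details the abstract does not give.

```python
from collections import Counter

# Hypothetical toy lexicon standing in for WordNet: each word maps to a
# concept label, and synonyms or hyponyms share the label of one concept.
TOY_CONCEPTS = {
    "car": "vehicle",
    "automobile": "vehicle",
    "truck": "vehicle",
    "physician": "doctor",
    "doctor": "doctor",
}


def word_counts(tokens):
    """Traditional word-form statistics: count each surface form separately."""
    return Counter(tokens)


def concept_counts(tokens, lexicon=TOY_CONCEPTS):
    """Concept statistics: map each token to its concept before counting.

    Tokens missing from the lexicon are treated as their own concept.
    """
    return Counter(lexicon.get(token, token) for token in tokens)


tokens = "the car hit a truck and the automobile stopped".split()
by_word = word_counts(tokens)
by_concept = concept_counts(tokens)

print(by_word["car"], by_word["truck"], by_word["automobile"])  # 1 1 1
print(by_concept["vehicle"])  # 3
```

Under word-form counting the three vehicle mentions look like three unrelated rare words, whereas concept counting merges them into one frequent concept, which is the effect the abstract attributes to the method.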

Multi-document Summarization Method Based on Topic-concepts Extract
SONG Xuan-chen,LIU Gui-quan. Multi-document Summarization Method Based on Topic-concepts Extract[J]. Computer Engineering, 2010, 36(4): 190-192
Authors:SONG Xuan-chen  LIU Gui-quan
Affiliation:(Department of Computer Science and Technology, University of Science and Technology of China, Hefei 230027)
Abstract: This paper proposes an effective topic-concept extraction method and applies it to multi-document summarization. The synonymy and hyponymy relations in WordNet are used for word sense disambiguation and for merging concept trees. Topic concepts are then extracted by a concept optimization method. From these topic concepts, a Vector Space Model (VSM) is constructed, and the summary is produced by the Maximal Marginal Relevance (MMR) method. The distinguishing feature of the method is that the word counting of traditional methods is replaced by concept counting, which extracts important information from the corpus more accurately. Experimental results on the DUC2005 evaluation indicate that the method produces better summaries than traditional methods.
Keywords: multi-document summarization  concept trees  concept extraction
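
The MMR selection step named in the abstract can be sketched as follows. This is the standard greedy MMR formulation over sparse concept vectors, written under stated assumptions rather than taken from the paper: the function names, the trade-off parameter `lam`, the use of cosine similarity, and the toy vectors are all illustrative choices, and the paper's actual weighting scheme is not given in the abstract.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(key, 0.0) for key, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mmr_select(sentence_vecs, topic_vec, k, lam=0.7):
    """Greedily pick up to k sentence indices by Maximal Marginal Relevance.

    Each step picks the candidate maximizing
        lam * sim(sentence, topic) - (1 - lam) * max sim(sentence, selected).
    """
    selected = []
    candidates = list(range(len(sentence_vecs)))
    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(sentence_vecs[i], topic_vec)
            redundancy = max(
                (cosine(sentence_vecs[i], sentence_vecs[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance - (1 - lam) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy concept vectors (assumed data): sentences 0 and 1 are duplicates.
sentence_vecs = [
    {"vehicle": 2, "crash": 1},
    {"vehicle": 2, "crash": 1},
    {"doctor": 1, "crash": 1},
]
topic_vec = {"vehicle": 2, "crash": 1, "doctor": 0.5}

print(mmr_select(sentence_vecs, topic_vec, k=2, lam=0.5))  # [0, 2]
print(mmr_select(sentence_vecs, topic_vec, k=2, lam=1.0))  # [0, 1]
```

With `lam=0.5` the duplicate sentence is penalized for redundancy and the less relevant but novel sentence is chosen second. With `lam=1.0` the redundancy term vanishes and selection reduces to ranking by relevance alone.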
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.