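Research on English Automatic Summarization Based on Concept Counting
Cite this article: Wan Min, Luo Zhensheng, Ji Heng, Gao Xiaoyun. Research on English Automatic Summarization Based on Concept Counting[J]. Computer Engineering and Applications, 2002, 38(24): 7-9,16
Authors: Wan Min, Luo Zhensheng, Ji Heng, Gao Xiaoyun
Affiliation: Computational Linguistics Laboratory, School of Humanities, Tsinghua University, Beijing 100084
Funding: National Natural Science Foundation of China (Grant No. 69972025)
Abstract: This paper proposes an automatic summarization method based on concept counting and semantic hierarchy analysis, and uses it to implement an English automatic summarization system. The system uses WordNet to analyze the words of an English text, selects the text's topic concepts by concept counting, and builds a vector space model from them. It then divides the text into meaning blocks according to the distribution of the topic concepts on the concept hierarchy tree and extracts the summary block by block, which offers an initial solution to the unbalanced summary structure of multi-topic texts. The paper mainly describes the construction of the concept hierarchy tree, the steps for extracting topic concepts, the computation of sentence significance, and the meaning-block division algorithm. Tests show that the method achieves higher recall and precision than the traditional method based on word-frequency statistics.

Keywords: Concept counting; Topic concept; Vector space model; Sentence significance
Article ID: 1002-8331-(2002)24-0007-03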

Research on Automatic Summarization Based on Concept Counting for English Texts
Wan Min, Luo Zhensheng, Ji Heng, Gao Xiaoyun. Research on Automatic Summarization Based on Concept Counting for English Texts[J]. Computer Engineering and Applications, 2002, 38(24): 7-9,16
Authors: Wan Min, Luo Zhensheng, Ji Heng, Gao Xiaoyun
Abstract: This paper puts forward a new summarization method based on concept counting and semantic hierarchy analysis, and on this basis develops an effective English text summarization system. The system uses the extracted topic concepts to construct a vector space model. Combined with discourse analysis and readability improvement, the abstract of a text is generated. The paper proposes parameters for evaluating topic concepts and mainly describes the detailed algorithms for building the concept hierarchy tree, extracting topic concepts, and applying topic concepts in generating abstracts. Experimental results show that, compared with word counting, the new method improves both the recall and the precision of the system, and it helps to solve the abstract distribution problem of multi-topic texts.
Keywords: Concept counting; Topic concept; Vector space model; Sentence significance
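
The pipeline outlined in the abstract (map words to concepts, count concepts, pick topic concepts, score sentences in a vector space) can be illustrated with a minimal sketch. This is not the authors' algorithm. The toy hypernym table `TOY_HYPERNYMS` is a hypothetical stand-in for WordNet, crediting every ancestor concept with one count per word is an assumption, and cosine similarity against the document's topic-concept vector is only one plausible reading of "sentence significance". Meaning-block division is not covered.

```python
import math
import re
from collections import Counter

# Hypothetical toy hypernym chains standing in for WordNet:
# each word maps to its concepts, from most specific to most general.
TOY_HYPERNYMS = {
    "dog": ["dog", "canine", "animal"],
    "puppy": ["dog", "canine", "animal"],
    "wolf": ["wolf", "canine", "animal"],
    "cat": ["cat", "feline", "animal"],
    "car": ["car", "vehicle"],
    "truck": ["truck", "vehicle"],
}


def tokenize(sentence):
    """Lowercase the sentence and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", sentence.lower())


def concept_counts(tokens):
    """Assumption: each word adds one count to every concept on its chain."""
    counts = Counter()
    for token in tokens:
        for concept in TOY_HYPERNYMS.get(token, []):
            counts[concept] += 1
    return counts


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_sentences(sentences, k=2):
    """Return the k most frequent concepts and one score per sentence."""
    per_sentence = [concept_counts(tokenize(s)) for s in sentences]
    document = Counter()
    for counts in per_sentence:
        document.update(counts)
    # Topic concepts: the k concepts with the highest document-level counts.
    topics = [concept for concept, _ in document.most_common(k)]
    doc_vector = [document[t] for t in topics]
    scores = [cosine([c[t] for t in topics], doc_vector) for c in per_sentence]
    return topics, scores


if __name__ == "__main__":
    text = [
        "The dog chased the puppy.",
        "A wolf watched the cat.",
        "The weather was mild.",
    ]
    topics, scores = rank_sentences(text, k=2)
    print(topics)
    for sentence, score in zip(text, scores):
        print(round(score, 3), sentence)
```

Counting at the concept level lets "dog", "puppy", and "wolf" reinforce a shared concept such as "canine" even though no single word repeats, which is the motivation the abstract gives for preferring concept counting over plain word counting.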
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.