
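Design and Implementation of a Compound Keyword Extraction System Based on SWN Theory
Cite this article: ZHOU Ya-fu, MA Li, DONG Luo-bin. Design and Implementation of a Compound Keyword Extraction System Based on SWN Theory[J]. Journal of Xi'an Institute of Posts and Telecommunications, 2007, 12(5): 82-86
Authors: ZHOU Ya-fu  MA Li  DONG Luo-bin
Affiliations: 1. Department of Computer Science, Xi'an Institute of Posts and Telecommunications, Xi'an, Shaanxi 710121
2. Information Center, Xi'an Institute of Posts and Telecommunications, Xi'an, Shaanxi 710121
3. Library, Xidian University, Xi'an, Shaanxi 710071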
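Abstract: This paper implements a system that uses the small world network (SWN) model to extract keywords from Chinese documents. The small world network model has two statistical properties: average path length and clustering coefficient. The algorithm used by the system first segments the document into words, then builds a document structure graph with the segmented words as nodes and the adjacency relations between them as edges. It then computes, for each word, the average path length increment and the clustering coefficient increment, and uses these two increments as the criteria for extracting keywords. Finally, the keywords are merged into compound keywords according to a merging strategy. The paper first introduces the concept of the small world network model and its application to keyword extraction in detail, then describes the design and implementation of the system, and finally demonstrates the correctness and effectiveness of the algorithm through experiments.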

Keywords: small world network  keyword extraction  average path length increment  clustering coefficient increment
Article ID: 1007-3264(2007)05-0082-05
Received: 2007-03-04
Revised: 2007-03-04
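
The pipeline summarized in the abstract (build a term graph from adjacency, then score each term by how much the two network statistics change) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. It assumes the document has already been segmented into a list of tokens, that a term's increment is measured by removing its node from the graph, and that an unreachable pair of nodes is charged a penalty distance equal to the number of nodes. All function names are hypothetical.

```python
from collections import deque
from itertools import combinations


def build_graph(tokens):
    """Nodes are distinct terms; edges link terms adjacent in the text."""
    graph = {t: set() for t in tokens}
    for a, b in zip(tokens, tokens[1:]):
        if a != b:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def remove_node(graph, node):
    """Return a copy of the graph without `node` and its edges."""
    return {u: nbrs - {node} for u, nbrs in graph.items() if u != node}


def bfs_distances(graph, source):
    """Shortest-path lengths from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def average_path_length(graph):
    """Mean shortest-path length over all ordered node pairs.

    Assumption: an unreachable pair is charged a penalty distance equal
    to the number of nodes, so disconnecting the graph raises the mean.
    """
    nodes = list(graph)
    n = len(nodes)
    if n < 2:
        return 0.0
    total = 0
    for u in nodes:
        dist = bfs_distances(graph, u)
        total += sum(dist.get(v, n) for v in nodes if v != u)
    return total / (n * (n - 1))


def average_clustering(graph):
    """Mean local clustering coefficient; degree < 2 counts as 0."""
    if not graph:
        return 0.0
    total = 0.0
    for u, nbrs in graph.items():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in graph[a])
        total += 2 * links / (k * (k - 1))
    return total / len(graph)


def score_terms(tokens):
    """Map each term to (path length increment, clustering increment)."""
    graph = build_graph(tokens)
    base_l = average_path_length(graph)
    base_c = average_clustering(graph)
    scores = {}
    for term in graph:
        reduced = remove_node(graph, term)  # assumed way to measure the increment
        scores[term] = (average_path_length(reduced) - base_l,
                        average_clustering(reduced) - base_c)
    return scores


# Toy, pre-segmented input; real input would come from a Chinese word segmenter.
tokens = ["a", "b", "c", "a", "d", "e"]
ranked = sorted(score_terms(tokens).items(), key=lambda kv: -kv[1][0])
for term, (d_l, d_c) in ranked:
    print(term, round(d_l, 3), round(d_c, 3))
```

In this sketch a term whose removal lengthens paths the most ranks first; how the two increments are combined or thresholded into a final keyword list is not stated in the abstract and is left open here.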

Design and implement of a system extracting keywords using SWN Theory
ZHOU Ya-fu, MA Li, DONG Luo-bin. Design and implement of a system extracting keywords using SWN Theory[J]. Journal of Xi'an Institute of Posts and Telecommunications, 2007, 12(5): 82-86
Authors:ZHOU Ya-fu  MA Li  DONG Luo-bin
Abstract: A system that extracts keywords from Chinese documents using a small world network (SWN) model is implemented. An SWN has two statistical properties: average path length and average clustering coefficient. First, a Chinese document is segmented into single terms and represented as a network in which the nodes represent terms and the edges represent the co-occurrence of terms, which describes the semantic association between terms in the document. Next, the increments in average path length and average clustering coefficient are computed for each term and used to extract keywords. Finally, the extracted keywords are combined into compound keywords. This paper first introduces the concepts of small world networks and describes the application of SWN to keyword extraction. It then presents the design and implementation of the system in detail. The experimental results show that the system is both reasonable and effective.
Keywords:small world network  keyword extracting  average path length increment  average clustering coefficient increment
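
The abstract does not say how extracted keywords are combined into compound keywords. As one possible reading, the sketch below assumes a simple adjacency rule: keywords that occur consecutively in the segmented text are joined into one compound keyword. The function name, the `sep` parameter, and the rule itself are assumptions made for illustration, not the paper's strategy.

```python
def merge_compound_keywords(tokens, keywords, sep=""):
    """Join maximal runs of consecutive keyword tokens into compounds.

    Assumed rule: keywords adjacent in the text form one compound keyword.
    Results keep first-occurrence order and contain no duplicates.
    """
    keyword_set = set(keywords)
    compounds = []
    run = []
    for tok in list(tokens) + [None]:  # sentinel flushes the final run
        if tok is not None and tok in keyword_set:
            run.append(tok)
            continue
        if run:
            phrase = sep.join(run)
            if phrase not in compounds:
                compounds.append(phrase)
            run = []
    return compounds


# Toy example with English tokens; Chinese terms would use the default sep="".
tokens = ["small", "world", "network", "is", "a", "network", "model"]
keywords = {"small", "world", "network", "model"}
print(merge_compound_keywords(tokens, keywords, sep=" "))
```

A single keyword with no keyword neighbors is returned unchanged, so the output mixes simple and compound keywords.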
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.