
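A Text Search System Based on Word Relation
Cite this article: Ding Likai, Xia Yongming, Qian Songrong. A Text Search System Based on Word Relation[J]. Microcomputer Applications, 2011, 27(3): 62-64, 6.
Authors: Ding Likai  Xia Yongming  Qian Songrong
Affiliation: Department of Communication Science and Engineering, Fudan University, Shanghai 200433, China
Abstract: Based on a statistical analysis of a corpus, this paper proposes the concept of word relation. By counting how often each word occurs in the text collection and how often any two words occur together, the relation between each pair of words is obtained. This parameter is then used to adjust the semantic vector, which effectively addresses the word dependency problem of the traditional vector space model. Combined with inverted index technology, a text retrieval system of considerable scale was built. Test results show that the system delivers good retrieval quality and good performance, and is of practical value.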

Keywords: Word Relation  Information Retrieval  Vector Space Model  Inverted Index
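
The abstract describes deriving word relations from word frequencies and pair co-occurrence frequencies, then using them to adjust a semantic vector. The page does not give the paper's formula, so the following is only a minimal sketch under stated assumptions: the relation of `a` to `b` is taken to be the share of documents containing `a` that also contain `b`, and the adjustment spreads each term's weight onto related terms through a hypothetical damping factor `alpha`. All function names here are illustrative, not the authors'.

```python
from collections import Counter
from itertools import combinations


def count_frequencies(docs):
    """Count how many documents contain each word and each unordered word pair."""
    df = Counter()
    pair_df = Counter()
    for doc in docs:
        words = sorted(set(doc))  # one count per document, pairs in sorted order
        df.update(words)
        pair_df.update(combinations(words, 2))
    return df, pair_df


def relation(a, b, df, pair_df):
    """Assumed measure (not from the paper): fraction of documents with `a` that also have `b`."""
    if a == b:
        return 1.0
    if df[a] == 0:
        return 0.0
    return pair_df[tuple(sorted((a, b)))] / df[a]


def adjust_vector(vector, df, pair_df, alpha=0.5):
    """Add weight to terms related to those in `vector`; `alpha` is a hypothetical damping factor."""
    adjusted = dict(vector)
    for term, weight in vector.items():
        for other in df:
            if other == term:
                continue
            r = relation(term, other, df, pair_df)
            if r > 0:
                adjusted[other] = adjusted.get(other, 0.0) + alpha * weight * r
    return adjusted


# Toy corpus of already-tokenized documents (made up for illustration).
docs = [
    ["text", "search", "index"],
    ["text", "retrieval", "index"],
    ["word", "relation", "text"],
    ["search", "engine"],
]
df, pair_df = count_frequencies(docs)
expanded = adjust_vector({"search": 1.0}, df, pair_df)
print(expanded)
```

In this toy corpus, the single-term query on `search` gains a weight of 0.25 for each of `text`, `index`, and `engine`, so documents that never mention `search` can still match. This is one plausible reading of how such an adjustment relaxes the term-independence assumption of the plain vector space model.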

A Text Search System Based on Word Relation
Ding Likai, Xia YongMing, Qian SongRong. A Text Search System Based on Word Relation[J]. Microcomputer Applications, 2011, 27(3): 62-64, 6.
Authors:Ding Likai  Xia YongMing  Qian SongRong
Affiliation: Ding Likai, Xia YongMing, Qian SongRong (Department of Communication Science and Engineering, Fudan University, Shanghai 200433, China)
Abstract: This paper introduces the concept of word relation, which reflects the statistical properties of a text collection. Word relations are defined by the number of documents that contain a given word or word pair. They are used to adjust the semantic vector and thereby address the word dependency problem of the traditional vector space model. The paper implements a text search system based on word relation and integrated with an inverted index. Several design issues are discussed in detail. It shows both good precision and sat...
Keywords:Word Relation  Information Retrieval  VSM  Inverted Index  
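
The abstract mentions that the system integrates an inverted index. As a rough illustration only, the sketch below builds a small in-memory inverted index and ranks documents by cosine similarity against a weighted query vector, visiting only the postings of the query terms. The raw term-frequency weighting, the function names, and the example weights are all assumptions, not details taken from the paper.

```python
import math
from collections import Counter, defaultdict


def build_inverted_index(docs):
    """Return (term -> {doc_id: term frequency}, per-document vector norms)."""
    index = defaultdict(dict)
    norms = []
    for doc_id, words in enumerate(docs):
        counts = Counter(words)
        for word, tf in counts.items():
            index[word][doc_id] = tf
        norms.append(math.sqrt(sum(tf * tf for tf in counts.values())))
    return index, norms


def search(query_vector, index, norms):
    """Rank documents by cosine similarity, reading only postings of query terms."""
    dot = defaultdict(float)
    for term, q_weight in query_vector.items():
        for doc_id, tf in index.get(term, {}).items():
            dot[doc_id] += q_weight * tf
    q_norm = math.sqrt(sum(w * w for w in query_vector.values()))
    scores = {
        doc_id: s / (q_norm * norms[doc_id])
        for doc_id, s in dot.items()
        if q_norm and norms[doc_id]
    }
    # Highest score first; ties broken by document id for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Toy corpus of already-tokenized documents (made up for illustration).
docs = [
    ["text", "search", "index"],
    ["text", "retrieval", "index"],
    ["word", "relation", "text"],
    ["search", "engine"],
]
index, norms = build_inverted_index(docs)

plain = search({"search": 1.0}, index, norms)
# Hypothetical adjusted query: some weight has been moved onto the related term "text".
adjusted = search({"search": 1.0, "text": 0.25}, index, norms)
print(plain)
print(adjusted)
```

With these toy inputs, the plain query reaches only the two documents containing `search`, while the adjusted query also reaches the other two through the postings list of `text` and changes which document ranks first.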
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.