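基于单汉字索引的全文检索系统的优化研究 (Optimization of a Full-Text Retrieval System Based on Single-Chinese-Character Indexing)
Cite as: 余海燕 (YU Haiyan), 张仲义 (ZHANG Zhongyi). 基于单汉字索引的全文检索系统的优化研究 [J]. 中文信息学报 (Journal of Chinese Information Processing), 2001, 15(4): 15-19, 27.
Authors: 余海燕 (YU Haiyan), 张仲义 (ZHANG Zhongyi)
Affiliation: Automation System Institute of Northern Jiao Tong University (北方交通大学自动化所)
Funding: 863 High-Technology Program (863-306-ZD-07-02)
Abstract: For a full-text retrieval system that builds its inverted index on single Chinese characters, the most pressing problem is how to improve storage efficiency and processing speed. This paper proposes three optimizations to address it: (1) compressing the inverted file with parameterized Golomb coding; (2) improving the logical-AND algorithm used to compute set intersections; and (3) applying parallel computing and double buffering. Experimental results show that, with these optimizations, a single-character full-text retrieval system reaches a practically usable level.

Keywords: full-text retrieval; single-Chinese-character indexing; inverted file; Golomb coding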
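
To make the first optimization concrete, here is a minimal Python sketch of Golomb-coding the gaps between sorted document IDs in one posting list. It is an illustration only, not the authors' implementation: the function names (`to_gaps`, `golomb_encode`, `choose_b`, and so on) are hypothetical, document IDs are assumed to start at 1, and the rule `b ≈ 0.69 × N / f` is a common textbook heuristic for picking the parameter, which the abstract itself does not specify.

```python
def to_gaps(doc_ids):
    """Turn a strictly increasing list of doc IDs (>= 1) into gaps (all >= 1)."""
    return [cur - prev for prev, cur in zip([0] + doc_ids[:-1], doc_ids)]


def from_gaps(gaps):
    """Inverse of to_gaps: running sum of the gaps."""
    doc_ids, total = [], 0
    for g in gaps:
        total += g
        doc_ids.append(total)
    return doc_ids


def choose_b(num_docs, doc_freq):
    """Assumed heuristic (not from the abstract): b is about 0.69 * N / f."""
    return max(1, int(0.69 * num_docs / doc_freq))


def _bits(value, width):
    """Fixed-width binary string; empty when width is 0."""
    return format(value, "0{}b".format(width)) if width > 0 else ""


def golomb_encode(numbers, b):
    """Encode positive integers as a '0'/'1' string with Golomb parameter b."""
    k = (b - 1).bit_length()      # ceil(log2(b))
    cutoff = (1 << k) - b         # remainders below this use k - 1 bits
    out = []
    for x in numbers:
        q, r = divmod(x - 1, b)
        out.append("1" * q + "0")                 # quotient in unary
        if r < cutoff:
            out.append(_bits(r, k - 1))           # short remainder code
        else:
            out.append(_bits(r + cutoff, k))      # long remainder code
    return "".join(out)


def golomb_decode(bits, b):
    """Decode a bit string produced by golomb_encode with the same b."""
    k = (b - 1).bit_length()
    cutoff = (1 << k) - b
    numbers, i = [], 0
    while i < len(bits):
        q = 0
        while bits[i] == "1":     # read the unary quotient
            q += 1
            i += 1
        i += 1                    # skip the terminating '0'
        r = 0
        if k > 0:
            if k > 1:
                r = int(bits[i:i + k - 1], 2)
                i += k - 1
            if r >= cutoff:       # long code: one more bit follows
                r = ((r << 1) | int(bits[i])) - cutoff
                i += 1
        numbers.append(q * b + r + 1)
    return numbers


if __name__ == "__main__":
    postings = [3, 7, 8, 15, 40]                  # doc IDs containing one character
    b = choose_b(num_docs=100, doc_freq=len(postings))
    encoded = golomb_encode(to_gaps(postings), b)
    assert from_gaps(golomb_decode(encoded, b)) == postings
    print(b, len(encoded), "bits")
```

A real index would pack the bits into bytes rather than build a Python string; the string form is used here only so the code words are easy to inspect.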

The Optimization of Full Text Retrieval System Based on Indexing of Single Chinese Character
YU Hai-yan, ZHANG Zhong-yi. The Optimization of Full Text Retrieval System Based on Indexing of Single Chinese Character[J]. Journal of Chinese Information Processing, 2001, 15(4): 15-19, 27.
Authors: YU Hai-yan, ZHANG Zhong-yi
Affiliation: Automation System Institute of Northern Jiao Tong University
Abstract: This paper discusses the optimization of a full-text retrieval system based on single-Chinese-character indexing from three aspects: compression of the inverted index file using Golomb coding, a bidirectional binary-search intersection algorithm, and the use of parallel computing with a double-buffer cache. Experiments show that these optimizations reduce the system's storage overhead and improve its performance.
Keywords: full text retrieval; single Chinese character indexing; inverted file; Golomb coding
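
The abstract names a "bidirectional binary-search intersection algorithm" without describing it. The Python sketch below shows one plausible reading, offered as an assumption rather than the paper's actual method: elements of the shorter sorted posting list are looked up alternately from its front and its back, and each lookup shrinks the window of the longer list that later binary searches must cover. The function name `intersect_bidirectional` is hypothetical.

```python
from bisect import bisect_left


def intersect_bidirectional(a, b):
    """Intersect two strictly increasing lists of doc IDs.

    Assumed interpretation: probe the shorter list from both ends and
    binary-search the longer list inside a window [lo, hi) that narrows
    from both sides after every probe.
    """
    if len(a) > len(b):
        a, b = b, a                       # a is the shorter list
    lo, hi = 0, len(b)                    # live search window in b
    i, j = 0, len(a) - 1                  # unprocessed range in a
    front, back = [], []
    while i <= j and lo < hi:
        # Probe from the front: everything left in a is larger than a[i].
        pos = bisect_left(b, a[i], lo, hi)
        if pos < hi and b[pos] == a[i]:
            front.append(a[i])
            lo = pos + 1
        else:
            lo = pos
        i += 1
        if i > j or lo >= hi:
            break
        # Probe from the back: everything left in a is smaller than a[j].
        pos = bisect_left(b, a[j], lo, hi)
        if pos < hi and b[pos] == a[j]:
            back.append(a[j])
        hi = pos
        j -= 1
    return front + back[::-1]             # back was collected in descending order


if __name__ == "__main__":
    rare = [2, 5, 9, 40]                               # postings of a rare character
    common = [1, 2, 3, 5, 8, 13, 21, 34, 40, 55]       # postings of a common one
    print(intersect_bidirectional(rare, common))
```

This kind of search-based intersection tends to pay off when one posting list is much shorter than the other, which is common in a single-character index where a query word is answered by AND-ing the lists of its characters; for lists of similar length a linear merge is usually the simpler choice.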
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.