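Hybrid distributed index organization strategy in search engines
Cite this article: 陈伟, 刘康苗, 卜佳俊, 陈纯, 张利军. Hybrid distributed index organization strategy in search engines [J]. 浙江大学学报(自然科学版), 2009, 43(8): 1361-1366.
Authors: 陈伟  刘康苗  卜佳俊  陈纯  张利军
Affiliation: (College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, Zhejiang, China)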
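Abstract: To address the query-performance and scalability problems of index organization strategies in search engines, a hybrid distributed index organization strategy (Loc-Glob) is proposed. The strategy combines the basic ideas of local and global index organization. The index servers of the search engine system are first divided logically into several index server pools, and the index data is assigned to the pools using the local (or global) index organization strategy. Within each pool, the index is then stored across the individual index servers using the global (or local) index organization strategy. The hybrid strategy scales better than either the local or the global strategy alone. Experimental results show that it improves on the global index organization strategy in both query performance and load balancing, delivers query performance roughly on par with the local index organization strategy, and achieves a high level of load balancing.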

Keywords: search engine  inverted index  distributed index organization  query performance  load balancing

Hybrid strategy to distributed index organization in search engine
CHEN Wei,LIU Kang-Miao,BO Jia-Jun,CHEN Chun,ZHANG Li-jun.Hybrid strategy to distributed index organization in search engine[J].Journal of Zhejiang University(Engineering Science),2009,43(8):1361-1366.
Authors:CHEN Wei  LIU Kang-Miao  BO Jia-Jun  CHEN Chun  ZHANG Li-jun
Affiliation:College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China
Abstract: A hybrid index organization strategy named Loc-Glob is proposed to improve query performance and scalability in search engines. Loc-Glob integrates two well-studied index partitioning schemes that are widely used in search engines. First, the index is partitioned across clusters of index servers according to the local (or global) index organization strategy, treating each cluster as a single machine. Then, the index assigned to each cluster is further partitioned across the index servers inside that cluster according to the global (or local) index organization strategy. Loc-Glob scales better than the traditional strategies and can therefore better accommodate the explosive growth in web pages. Experimental results indicate that the throughput of Loc-Glob exceeds that of the global index organization and is very close to that of the local index organization, and that Loc-Glob achieves a good level of load balancing.
Keywords:search engine  inverted index  distributed index organization  query performance  load balancing
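
The two-level partitioning described in the abstract can be illustrated with a minimal sketch of the local-then-global variant. Everything below is an assumption made for illustration and is not taken from the paper: the function names (`build_loc_glob`, `query`, `stable_hash`), the modulo assignment of documents to pools, and the CRC32 hashing of terms to servers. The abstract does not state which assignment rules the authors actually use.

```python
import zlib
from collections import defaultdict


def stable_hash(key: str) -> int:
    """Deterministic hash (CRC32), so term placement is the same on every run."""
    return zlib.crc32(key.encode("utf-8"))


def build_loc_glob(docs, num_pools, servers_per_pool):
    """Build a two-level index: index[pool][server][term] -> list of doc ids.

    docs maps an integer doc id to its list of terms.
    Level 1 (local, document-based): each document goes to exactly one pool.
    Level 2 (global, term-based): inside a pool, each term's posting list
    lives on exactly one server.
    Both assignment rules are illustrative assumptions, not the paper's.
    """
    index = [
        [defaultdict(list) for _ in range(servers_per_pool)]
        for _ in range(num_pools)
    ]
    for doc_id in sorted(docs):
        pool = doc_id % num_pools  # assumed rule: modulo on the doc id
        for term in set(docs[doc_id]):
            server = stable_hash(term) % servers_per_pool  # assumed rule: hash of the term
            index[pool][server][term].append(doc_id)
    return index


def query(index, term):
    """Look up a single term.

    Every pool must be asked (documents are split across pools), but inside
    each pool only the one server that owns the term is contacted.
    """
    servers_per_pool = len(index[0])
    server = stable_hash(term) % servers_per_pool
    hits = []
    for pool in index:
        hits.extend(pool[server].get(term, []))
    return sorted(hits)


docs = {
    0: ["search", "engine"],
    1: ["index", "engine"],
    2: ["search", "index"],
    3: ["engine"],
}
index = build_loc_glob(docs, num_pools=2, servers_per_pool=2)
print(query(index, "engine"))  # [0, 1, 3]
```

In this sketch a single-term query touches one server per pool rather than every server in the system, which is the structural difference from a purely local organization. Swapping the two rules (terms to pools, documents to servers inside a pool) would give the global-then-local variant that the abstract also mentions.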