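Search Engine Query Results Caching Based on Log Analysis
Cite as: Ma Hongyuan, Wang Bin. Search Engine Query Results Caching Based on Log Analysis[J]. Journal of Computer Research and Development, 2012(Z1): 224-228.
Authors: Ma Hongyuan, Wang Bin
Affiliations: Institute of Computing Technology, Chinese Academy of Sciences; Graduate University of Chinese Academy of Sciences
Funding: National Natural Science Foundation of China (60873166, 61070111); National Basic Research Program of China (973 Program) (2007CB311103); National High Technology Research and Development Program of China (863 Program) (2006AA010105); Key Project of Science and Technology Research of the Ministry of Education (109028)
Abstract: Caching is a key technique for reducing response time and system load, and it is one of the important areas of research on search engine architecture. Based on an analysis of about 15 million user query log entries collected from the Sogou search engine over a period of nearly one month, this paper examines query result caching in terms of query locality, caching policy, cache capacity, and workload periodicity. The analysis shows that combining a hybrid caching policy with increased cache capacity can effectively improve search engine system performance.

Keywords: information retrieval; query log analysis; performance optimization; search engine; caching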

Search Engine Query Results Caching Based on Log Analysis
Ma Hongyuan, Wang Bin. Search Engine Query Results Caching Based on Log Analysis[J]. Journal of Computer Research and Development, 2012(Z1): 224-228.
Authors: Ma Hongyuan and Wang Bin
Affiliation: 1 (Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190); 2 (Graduate University of Chinese Academy of Sciences, Beijing 100049)
Abstract: Caching is an effective technique for reducing user response time and back-end server workload, and it is one of the most important research issues in search engine architecture. In this paper, we analyze SOGOU search engine query logs containing approximately 15 million search requests collected over about one month, and we study query results caching for search engines in terms of locality of reference, caching policy, cache capacity, and workload cycle. We also conduct a series of experiments on query results caching and examine how these factors affect it. The analyses and experimental results show that combining a hybrid caching policy with increased cache capacity can improve search engine system performance.
Keywords: information retrieval; query log analysis; performance optimization; search engine; caching
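
The abstract names a hybrid caching policy and cache capacity as the two factors examined, but this record does not describe how the paper's hybrid policy is built. As an illustration only, the sketch below assumes one common reading of "hybrid": a read-only static part filled with the most frequent queries from a training slice of the log, plus a dynamic part managed with LRU replacement. The names `HybridResultCache` and `hit_ratio`, and the split into training and test logs, are hypothetical and are not taken from the paper.

```python
from collections import Counter, OrderedDict


class HybridResultCache:
    """Assumed hybrid policy: static set of popular queries + LRU dynamic part."""

    def __init__(self, static_queries, dynamic_capacity):
        self.static = set(static_queries)      # never evicted
        self.dynamic = OrderedDict()           # ordered from least to most recent
        self.dynamic_capacity = dynamic_capacity

    def lookup(self, query):
        """Return True on a hit; on a miss, admit the query to the LRU part."""
        if query in self.static:
            return True
        if query in self.dynamic:
            self.dynamic.move_to_end(query)    # mark as most recently used
            return True
        if self.dynamic_capacity > 0:
            self.dynamic[query] = True         # stand-in for the cached result page
            if len(self.dynamic) > self.dynamic_capacity:
                self.dynamic.popitem(last=False)  # evict least recently used
        return False


def hit_ratio(train_log, test_log, static_size, dynamic_size):
    """Fill the static part from train_log, then replay test_log and count hits."""
    top = [q for q, _ in Counter(train_log).most_common(static_size)]
    cache = HybridResultCache(top, dynamic_size)
    hits = sum(cache.lookup(q) for q in test_log)
    return hits / len(test_log) if test_log else 0.0
```

Calling `hit_ratio` with different `static_size` and `dynamic_size` values on the same log is one simple way to compare a pure LRU cache (`static_size=0`), a pure static cache (`dynamic_size=0`), and mixed configurations at a fixed total capacity.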
This article is indexed in CNKI and other databases.