首页 | 本学科首页   官方微博 | 高级检索  
     


Optimizing top-<Emphasis Type="Italic">k</Emphasis> retrieval: submodularity analysis and search strategies
Authors:Chaofeng Sha  Keqiang Wang  Dell Zhang  Xiaoling Wang  Aoying Zhou
Affiliation:1.School of Computer Science, Shanghai Key Laboratory of Intelligent Information Processing,Fudan University,Shanghai,China;2.Shanghai Key Laboratory of Trustworthy Computing,East China Normal University,Shanghai,China;3.Department of Computer Science and Information Systems, Birkbeck,University of London,London,UK
Abstract:The key issue in top-k retrieval, finding a set of k documents (from a large document collection) that can best answer a user’s query, is to strike the optimal balance between relevance and diversity. In this paper, we study the top-k retrieval problem in the framework of facility location analysis and prove the submodularity of that objective function which provides a theoretical approximation guarantee of factor 1?\(\frac{1}{e}\) for the (best-first) greedy search algorithm. Furthermore, we propose a two-stage hybrid search strategy which first obtains a high-quality initial set of top-k documents via greedy search, and then refines that result set iteratively via local search. Experiments on two large TREC benchmark datasets show that our two-stage hybrid search strategy approach can supersede the existing ones effectively and efficiently.
Keywords:
本文献已被 SpringerLink 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号