A graph-based cache for large-scale similarity search engines

Authors:	Veronica Gil-Costa, Mauricio Marin, Carolina Bonacic, Roberto Solar

Affiliation:	1. Universidad Nacional de San Luis, San Luis, Argentina; 2. CONICET, San Luis, Argentina; 3. CeBiB, Centre for Biotechnology and Bioengineering, Santiago, Chile; 4. DIINF, Universidad de Santiago de Chile, Santiago, Chile; 5. CITIAPS, Universidad de Santiago de Chile, Santiago, Chile

Abstract:	Large-scale similarity search engines are complex systems devised to process unstructured data such as images and videos. These systems are deployed on clusters of distributed processors connected by high-speed networks. To process a new query, a distance function is evaluated between the query and the objects stored in the database. This process relies on a metric space index distributed among the processors. In this paper, we propose a cache-based strategy that uses pre-computed information to reduce the number of computations required to retrieve the top-k object results for user queries. Our proposal executes an approximate similarity search algorithm that takes advantage of the links between objects stored in the cache memory. Those links form a graph of similarity among pre-computed queries. Compared to previous methods in the literature, the proposed approach reduces the number of distance evaluations by up to 60%.
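
To make the idea of a similarity graph over cached queries more concrete, the following is a minimal sketch and not the authors' algorithm. The abstract does not say how links are built or how the graph is traversed, so everything here is an assumption made for illustration: the class name GraphCache, the link_radius threshold used to connect cached queries, the greedy walk toward the nearest cached query, and the use of Euclidean distance on tuples. The sketch counts distance evaluations, since that is the cost the abstract reports.

```python
import math


def euclidean(a, b):
    """Distance function for the metric space (assumed Euclidean here)."""
    return math.dist(a, b)


class GraphCache:
    """Hypothetical cache: past queries are nodes, similar queries are linked."""

    def __init__(self, dist, link_radius):
        self.dist = dist
        self.link_radius = link_radius  # assumed rule: link queries closer than this
        self.queries = []   # cached query objects
        self.results = []   # results[i] holds the stored top-k objects of queries[i]
        self.links = []     # links[i] holds indices of cached queries linked to i
        self.evaluations = 0

    def _d(self, a, b):
        # Every distance computation goes through here so it can be counted.
        self.evaluations += 1
        return self.dist(a, b)

    def add(self, query, topk):
        """Store an answered query and link it to nearby cached queries."""
        idx = len(self.queries)
        near = {i for i, q in enumerate(self.queries)
                if self._d(query, q) <= self.link_radius}
        self.queries.append(query)
        self.results.append(list(topk))
        self.links.append(near)
        for i in near:
            self.links[i].add(idx)

    def search(self, query, k, start=0):
        """Approximate top-k: walk the graph greedily, then re-rank cached results."""
        if not self.queries:
            return []
        current = start
        best = self._d(query, self.queries[current])
        while True:
            scored = [(self._d(query, self.queries[n]), n)
                      for n in sorted(self.links[current])]
            if not scored or min(scored)[0] >= best:
                break  # no linked query is closer: local minimum reached
            best, current = min(scored)
        # Candidates are the cached results of the reached node and its neighbours.
        candidates = set(self.results[current])
        for n in self.links[current]:
            candidates.update(self.results[n])
        return sorted(candidates, key=lambda o: self._d(query, o))[:k]


cache = GraphCache(euclidean, link_radius=1.5)
cache.add((0.0, 0.0), [(0.0, 0.1), (0.2, 0.0)])
cache.add((1.0, 0.0), [(1.0, 0.1), (0.9, 0.0)])
cache.add((2.0, 0.0), [(2.0, 0.1), (2.1, 0.0)])
result = cache.search((2.05, 0.0), k=2)
print(result, cache.evaluations)
```

In this toy setup the search only evaluates distances against cached queries along the walk and against a handful of cached result objects, rather than against the whole database. The answer is approximate: objects that were never returned for a nearby cached query cannot appear in the result.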

This article is indexed in SpringerLink and other databases.