Comparing data summaries for processing live queries over Linked Data

Authors:	Jürgen Umbrich, Katja Hose, Marcel Karnstedt, Andreas Harth, Axel Polleres

Affiliation:	1. Digital Enterprise Research Institute, National University of Ireland, Galway, Ireland 2. Max-Planck-Institut für Informatik, Saarbrücken, Germany 3. Institute AIFB, Karlsruhe Institute of Technology, Karlsruhe, Germany

Abstract:	A growing amount of Linked Data (graph-structured data accessible at sources distributed across the Web) enables advanced data integration and decision-making applications. Typical systems operating on Linked Data collect (crawl) and pre-process (index) large amounts of data, and evaluate queries against a centralised repository. Given that crawling and indexing are time-consuming operations, the data in the centralised index may be out of date at query execution time. An ideal query answering system for querying Linked Data live should return current answers in a reasonable amount of time, even on corpora as large as the Web. In such a live query system, source selection (determining which sources contribute answers to a query) is a crucial step. In this article we propose to use lightweight data summaries for determining relevant sources during query evaluation. We compare several data structures and hash functions with respect to their suitability for building such summaries, stressing benefits for queries that contain joins and require ranking of results and sources. We elaborate on join variants, join ordering and ranking. We analyse the different approaches theoretically and provide results of an extensive experimental evaluation.

Keywords:
This article is indexed in SpringerLink and other databases.
