Sampling-based estimators for subset-based queries

Authors:	Shantanu Joshi, Christopher Jermaine

Affiliation:	(1) Server Manageability, Oracle, 400 Oracle Parkway, Redwood Shores, CA 94065, USA; (2) Computer and Information Science and Engineering, University of Florida, Gainesville, FL 32611, USA

Abstract:	We consider the problem of using sampling to estimate the result of an aggregation operation over a subset-based SQL query, where a subquery is correlated to an outer query by a NOT EXISTS, NOT IN, EXISTS, or IN clause. We design an estimator for this class of queries and prove that it is unbiased. We then provide a second, biased estimator that makes use of the superpopulation concept from statistics to minimize the mean squared error of the resulting estimate. Both estimators are evaluated in an extensive set of experiments. Material in this paper is based upon work supported by the National Science Foundation via grants 0347408 and 0612170.
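
To make the query class concrete, the sketch below sets up a toy NOT EXISTS aggregate and a simple scale-up estimate. It is not the estimator from the paper, whose construction the abstract does not describe. It assumes a much simpler setting in which only the outer table is sampled (uniformly, without replacement) and the inner table is available in full. The table contents and the names `orders`, `returned_ids`, `exact_sum`, and `scaled_estimate` are hypothetical.

```python
from itertools import combinations

# Hypothetical toy data: (order_id, amount) rows for the outer query, and
# the set of order_ids that appear in the inner (subquery) table.
orders = [(1, 10.0), (2, 20.0), (3, 30.0), (4, 40.0)]
returned_ids = {2, 4}


def exact_sum(outer, inner_keys):
    """SUM(amount) over outer rows for which NOT EXISTS a matching inner row."""
    return sum(amount for key, amount in outer if key not in inner_keys)


def scaled_estimate(sample, population_size, inner_keys):
    """Scale the sample's qualifying sum by N/n.

    Assumption: only the outer table is sampled; the inner keys are complete.
    """
    return population_size / len(sample) * exact_sum(sample, inner_keys)


# Average the estimate over every possible size-2 sample of the outer table.
samples = list(combinations(orders, 2))
mean_estimate = sum(
    scaled_estimate(s, len(orders), returned_ids) for s in samples
) / len(samples)

print(exact_sum(orders, returned_ids), mean_estimate)
```

Averaged over all possible samples, the scaled estimate equals the exact answer in this simplified setting, which is what unbiasedness means. If the inner table were sampled as well, a missing match in the sample would no longer imply a missing match in the full table, so this naive scale-up would not carry over directly.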

Keywords:	Sampling; Approximate query processing; Aggregate query processing
This article is indexed in SpringerLink and other databases.