Probabilistic skylines on uncertain data: model and bounding-pruning-refining methods |
| |
Authors: | Bin Jiang Jian Pei Xuemin Lin Yidong Yuan |
| |
Affiliation: | (1) School of Computing Science, Simon Fraser University, Burnaby, BC, Canada;(2) School of Computer Science and Engineering, The University of New South Wales and NICTA, Sydney, NSW, Australia |
| |
Abstract: | Uncertain data are inherent in some important applications. Although a considerable amount of research has been dedicated
to modeling uncertain data and answering some types of queries on uncertain data, how to conduct advanced analysis on uncertain
data remains an open problem at large. In this paper, we tackle the problem of skyline analysis on uncertain data. We propose a novel probabilistic skyline model where an uncertain object may take a probability to be in the skyline, and a p-skyline contains all objects whose skyline probabilities are at least p (0 < p ≤ 1). Computing probabilistic skylines on large uncertain data sets is challenging. We develop a bounding-pruning-refining
framework and three algorithms systematically. The bottom-up algorithm computes the skyline probabilities of some selected
instances of uncertain objects, and uses those instances to prune other instances and uncertain objects effectively. The top-down
algorithm recursively partitions the instances of uncertain objects into subsets, and prunes subsets and objects aggressively.
Combining the advantages of the bottom-up algorithm and the top-down algorithm, we develop a hybrid algorithm to further improve
the performance. Our experimental results on both the real NBA player data set and the benchmark synthetic data sets show
that probabilistic skylines are interesting and useful, and our algorithms are efficient on large data sets. |
| |
Keywords: | |
本文献已被 SpringerLink 等数据库收录! |
|