Privacy preservation for data cubes |
| |
Authors: | Sam Y. Sung, Yao Liu, Hui Xiong, Peter A. Ng |
| |
Affiliation: | (1) Department of Computer Science, National University of Singapore, 3 Science Drive 2, Singapore 117543; (2) MSIS Department, Rutgers University, USA; (3) Department of Computer Science, University of Texas-Pan American, Edinburg, TX, USA |
| |
Abstract: | A range query computes an aggregate over all selected cells of an online analytical processing (OLAP) data cube, where
the selection is specified by a range of contiguous values in each dimension. An important practical issue is how to
preserve the confidential information in individual data cells while still providing accurate estimates of the original
aggregated values for range queries. In this paper, we propose an effective solution to this problem, called the zero-sum method.
We derive theoretical formulas to analyse the performance of our method. Empirical experiments are also carried out
using the Analytical Processing Benchmark (APB) dataset from the OLAP Council. Various parameters, such as the privacy factor
and the accuracy factor, are considered and tested in the experiments. Our experimental results show that there
is a trade-off between privacy preservation and range query accuracy, and that the zero-sum method fulfils three design goals:
security, accuracy, and accessibility.
Sam Y. Sung is an Associate Professor in the Department of Computer Science, School of Computing, National University of Singapore. He
received a B.Sc. from the National Taiwan University in 1973, and the M.Sc. and Ph.D. in computer science from the University
of Minnesota in 1977 and 1983, respectively. He was with the University of Oklahoma and the University of Memphis in the United
States before joining the National University of Singapore. His research interests include information retrieval, data mining,
pictorial databases and mobile computing. He has published more than 80 papers in various conferences and journals, including
IEEE Transactions on Software Engineering and IEEE Transactions on Knowledge and Data Engineering.
Yao Liu received the B.E. degree in computer science and technology from Peking University in 1996 and the M.S. degree from the Software
Institute of the Chinese Science Academy in 1999. Currently, she is a Ph.D. candidate in the Department of Computer Science
at the National University of Singapore. Her research interests include data warehousing, database security, data mining and
high-speed networking.
Hui Xiong received the B.E. degree in Automation from the University of Science and Technology of China, Hefei, China, in 1995, the
M.S. degree in Computer Science from the National University of Singapore, Singapore, in 2000, and the Ph.D. degree in Computer
Science from the University of Minnesota, Minneapolis, MN, USA, in 2005. He is currently an Assistant Professor of Computer
Information Systems in the Management Science & Information Systems Department at Rutgers University, NJ, USA. His research
interests include data mining, databases, and statistical computing with applications in bioinformatics, database security,
and self-managing systems. He is a member of the IEEE Computer Society and the ACM.
Peter A. Ng is currently the Chairperson and Professor of Computer Science at the University of Texas-Pan American. He received his Ph.D.
from the University of Texas at Austin in 1974. Previously, he served as Vice President at the Fudan International Institute
for Information Science and Technology, Shanghai, China, from 1999 to 2002, and as Executive Director for the Global e-Learning
Project at the University of Nebraska at Omaha from 2000 to 2003. He was appointed an Advisory Professor of Computer Science at
Fudan University, Shanghai, China, in 1999. His recent research focuses on document and information-based processing, retrieval
and management. He has published many journal and conference articles in this area. He served as Editor-in-Chief of
the Journal on Systems Integration (1991–2001) and has been an Advisory Editor for the Data and Knowledge Engineering Journal since
1989. |
| |
Keywords: | Privacy preservation; OLAP; Random data distortion; Range query |
This article is indexed in SpringerLink and other databases. |
|