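A Privacy Protection Method for Multiple Sensitive Attributes in Data Publishing
Cite this article: 杨晓春,王雅哲,王斌,于戈.数据发布中面向多敏感属性的隐私保护方法[J].计算机学报,2008,31(4):574-587.
Authors: 杨晓春 (YANG Xiao-Chun), 王雅哲 (WANG Ya-Zhe), 王斌 (WANG Bin), 于戈 (YU Ge)
Affiliation: College of Information Science and Engineering, Northeastern University, Shenyang 110004, China
Funding: Supported by the Program for New Century Excellent Talents in University (NCET-06-0290), the National Natural Science Foundation of China (60503036), and the Fok Ying Tung Education Foundation preferred funding project for young teachers (104027).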
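Abstract: Existing privacy-preserving data publishing techniques usually focus on data with a single sensitive attribute; applying them directly to data with multiple sensitive attributes leaks a large amount of private information. This paper is the first to study the problem of publishing data with multiple sensitive attributes in detail. Building on the idea of protecting private data through lossy join, it proposes a multi-dimensional bucket grouping technique for publishing private data with multiple sensitive attributes, MSB (Multi-Sensitive Bucketization). To avoid the high complexity of exhaustive search, it first proposes three different linear-time greedy algorithms: the maximal-bucket-first algorithm (MBF), the maximal single-dimension-capacity-first algorithm (MSDCF), and the maximal multi-dimension-capacity-first algorithm (MMDCF). In addition, to account for differences in the importance of published data in practical applications, it proposes a weighted multi-dimensional bucket grouping technique. Extensive experiments on real datasets show that the additional information loss of the first three algorithms is 0.04 and that their suppression ratios are all below 0.06. With the weighted multi-dimensional bucket grouping technique, more than 70% of the information that the data owner defines as important can be published.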

Keywords: data publishing; data privacy; multiple sensitive attributes; lossy join; l-diversity
Revised: November 26, 2007

Privacy Preserving Approaches for Multiple Sensitive Attributes in Data Publishing
YANG Xiao-Chun,WANG Ya-Zhe,WANG Bin,YU Ge.Privacy Preserving Approaches for Multiple Sensitive Attributes in Data Publishing[J].Chinese Journal of Computers,2008,31(4):574-587.
Authors: YANG Xiao-Chun, WANG Ya-Zhe, WANG Bin, YU Ge
Abstract: Current privacy-preserving data publishing techniques concentrate on tables with a single sensitive attribute, yet most real-world applications contain multiple sensitive attributes. Directly applying the existing single-sensitive-attribute techniques to such data often causes unexpected disclosure of private information. This paper is the first to study in detail the problem of securely publishing data whose sensitive information spans multiple attributes, and it proposes a multi-dimensional bucket grouping approach based on the idea of lossy join, called Multi-Sensitive Bucketization (MSB). To avoid exhaustive search, three linear-time greedy MSB algorithms are proposed: the maximal-bucket-first algorithm (MBF), the maximal single-dimension-capacity-first algorithm (MSDCF), and the maximal multi-dimension-capacity-first algorithm (MMDCF). In addition, to account for differences in the importance of the published data, a weighted MSB approach is proposed. Experimental results on real-world datasets show that the additional information loss of the proposed MSB methods is no more than 0.04 and that their suppression ratios are below 0.06. The weighted MSB approach guarantees a publishing ratio of more than 70%.
Keywords: data publishing; data privacy; multi-sensitive attributes; lossy join; l-diversity
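
The grouping idea summarized in the abstract can be illustrated with a small sketch. It is not the paper's MBF, MSDCF, or MMDCF algorithm, whose details the abstract does not give. It assumes only that records are placed into multi-dimensional buckets keyed by their combination of sensitive values, and that each published group should contain l records whose values differ in every sensitive attribute. The function name `greedy_bucket_grouping`, the largest-bucket-first selection rule as written here, and the sample data are all hypothetical, and the sketch makes no attempt at linear running time.

```python
from collections import defaultdict


def greedy_bucket_grouping(records, l):
    """Hypothetical sketch: group records so that each group holds l records
    whose sensitive values are pairwise distinct in every sensitive dimension.

    records: list of (record_id, sensitive_values) pairs, where
             sensitive_values is a tuple with one entry per sensitive attribute.
    Returns (groups, suppressed_ids).
    """
    # One bucket per combination of sensitive values (a multi-dimensional bucket).
    buckets = defaultdict(list)
    for rec_id, sens in records:
        buckets[tuple(sens)].append(rec_id)

    groups = []
    while True:
        # Visit non-empty buckets from largest to smallest; ties are broken
        # by the bucket key so that the result is deterministic.
        order = sorted(
            (k for k in buckets if buckets[k]),
            key=lambda k: (-len(buckets[k]), k),
        )
        chosen = []
        for key in order:
            # Accept a bucket only if it shares no value, in any dimension,
            # with a bucket already chosen for this group.
            if all(key[d] != c[d] for c in chosen for d in range(len(key))):
                chosen.append(key)
                if len(chosen) == l:
                    break
        if len(chosen) < l:
            break  # no further group of l mutually distinct buckets exists
        groups.append([buckets[k].pop() for k in chosen])

    # Records that could not be placed in any group are suppressed.
    suppressed = [rec_id for k in sorted(buckets) for rec_id in buckets[k]]
    return groups, suppressed


if __name__ == "__main__":
    # Hypothetical data with two sensitive attributes: (disease, income level).
    sample = [
        (1, ("flu", "low")), (2, ("flu", "low")),
        (3, ("cancer", "high")), (4, ("cancer", "high")),
        (5, ("hiv", "mid")), (6, ("flu", "high")),
    ]
    groups, suppressed = greedy_bucket_grouping(sample, l=2)
    print("groups:", groups)
    print("suppressed:", suppressed)
```

In this sketch, the share of records left in `suppressed` plays the role of the suppression ratio mentioned in the abstract. Publishing the groups through a lossy join, that is, separating the quasi-identifier values from the sensitive values of each group, is not shown.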
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.