Micro-Aggregation Algorithm Based on Sensitive Attribute Entropy (基于敏感属性熵的微聚集算法)
Citation: YANG Jing, WANG Chao, ZHANG Jian-pei. Micro-Aggregation Algorithm Based on Sensitive Attribute Entropy [J]. Acta Electronica Sinica (电子学报), 2014, 42(7): 1327-1337.
Authors: YANG Jing (杨静), WANG Chao (王超), ZHANG Jian-pei (张健沛)
Affiliation: College of Computer Science and Technology, Harbin Engineering University, Harbin, Heilongjiang 150001, China
Funding: National Natural Science Foundation of China (No. 61370083, No. 61073043, No. 61073041); Specialized Research Fund for the Doctoral Program of Higher Education (No. 20112304110011, No. 20122304110012); Harbin Special Fund for Scientific and Technological Innovation Talents Research (Outstanding Academic Leader)
Abstract: In clustering, an inappropriate distance measure causes unnecessary information loss during anonymization, and defining a suitable distance measure for different attribute types has long been a difficult problem. This paper introduces the concept of a semantic attribute and proposes a coding hierarchy tree to represent semantic attributes, which effectively reduces information loss during anonymization. In the p-sensitive k-anonymity model, an uneven distribution of sensitive attribute values in the clustering results can disclose sensitive information. The paper therefore proposes a micro-aggregation algorithm based on sensitive attribute entropy, together with an anonymous protection factor that describes the degree of privacy protection. During clustering, the algorithm keeps the anonymous protection factor at its maximum so that sensitive attribute values are distributed more uniformly in the clustering results, which counters background knowledge attacks and reduces the risk of privacy leakage. Finally, experiments verify the rationality and effectiveness of the algorithm.

Keywords: privacy preserving; coding hierarchy tree; micro-aggregation; p-sensitive k-anonymity; sensitive attribute entropy
Received: 2013-06-25
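
The abstract does not give the formulas for sensitive attribute entropy or the anonymous protection factor, so the following is only a rough illustration, not the authors' algorithm. It assumes plain Shannon entropy over the sensitive values in each cluster, and uses the usual reading of p-sensitive k-anonymity (every cluster holds at least k records and at least p distinct sensitive values). The function names and the sample diagnoses are hypothetical.

```python
from collections import Counter
from math import log2


def sensitive_entropy(values):
    """Shannon entropy (in bits) of the sensitive values in one cluster.

    Assumption: the paper's own entropy definition may differ from this.
    """
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def is_p_sensitive_k_anonymous(clusters, p, k):
    """True if every cluster has >= k records and >= p distinct sensitive values."""
    return all(len(c) >= k and len(set(c)) >= p for c in clusters)


# Hypothetical sensitive values for two clusters of four records each.
skewed = ["flu", "flu", "flu", "HIV"]        # one value dominates
even = ["flu", "HIV", "asthma", "cancer"]    # values spread uniformly

for name, cluster in (("skewed", skewed), ("even", even)):
    print(name, round(sensitive_entropy(cluster), 3))

print(is_p_sensitive_k_anonymous([skewed, even], p=2, k=4))
```

Both clusters satisfy 2-sensitive 4-anonymity, yet the skewed cluster has lower entropy: an attacker with background knowledge can guess "flu" for any of its records with 75% confidence. This is the kind of uneven distribution that a purely count-based p-sensitive check does not catch.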
This article is indexed in CNKI and other databases.