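A k-modes Clustering Method Supporting Local Differential Privacy Protection
Cite this article: PENG Chun-chun, CHEN Yan-li, XUN Yan-mei. A k-modes Clustering Method Supporting Local Differential Privacy Protection [J]. Computer Science (计算机科学), 2021, 48(2): 105-113.
Authors: PENG Chun-chun (彭春春)  CHEN Yan-li (陈燕俐)  XUN Yan-mei (荀艳梅)
Affiliation (all three authors): School of Computer Science, School of Software, and School of Cyberspace Security, Nanjing University of Posts and Telecommunications (南京邮电大学), Nanjing 210003
Abstract: How to mine data usefully while protecting data privacy has become an active research question. Because in many practical scenarios it is hard to find a truly trustworthy third party to process users' sensitive data, this paper proposes, for the first time, a clustering scheme that supports local differential privacy: LDPK-modes (Local Differential Privacy K-modes). Compared with traditional clustering algorithms based on centralized differential privacy, it no longer needs a trusted third party to collect and process the data; instead, users privatize their own data, which greatly reduces the chance that a third party can steal users' private information. Users perturb their sensitive data with a randomized response mechanism that satisfies the definition of local d-privacy (local differential privacy with a distance metric). After collecting the perturbed data, the third party recovers its statistical features, generates a synthetic data set, and runs k-modes clustering on it. During clustering, features that occur frequently in the data set are assigned to the initial cluster centers, which further improves the utility of the clustering results. Theoretical analysis and experimental results demonstrate the privacy and the clustering utility of LDPK-modes.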

Keywords: local differential privacy  k-modes  d-privacy  clustering  privacy protection
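
The perturb-then-recover step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the paper's d-privacy randomized response mechanism, so the code below swaps in the standard k-ary (generalized) randomized response, which satisfies plain ε-local differential privacy without a distance metric. The function names `grr_perturb` and `grr_estimate` and all parameters are hypothetical and are not taken from the paper.

```python
import math
import random
from collections import Counter

# Assumption: standard k-ary randomized response (epsilon-LDP), used here as a
# stand-in for the paper's d-privacy mechanism, which the abstract does not give.

def grr_perturb(value, domain, epsilon, rng):
    """User side: keep the true value with probability p, else report another value."""
    k = len(domain)  # requires k >= 2
    p = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if rng.random() < p:
        return value
    others = [v for v in domain if v != value]
    return rng.choice(others)

def grr_estimate(reports, domain, epsilon):
    """Collector side: unbiased estimate of the true value frequencies."""
    k = len(domain)
    n = len(reports)
    e = math.exp(epsilon)
    p = e / (e + k - 1)   # probability of reporting the true value
    q = 1 / (e + k - 1)   # probability of reporting one specific other value
    counts = Counter(reports)
    return {v: (counts[v] / n - q) / (p - q) for v in domain}

# Usage: 20000 simulated users, one categorical attribute with three values.
rng = random.Random(0)
domain = ["red", "green", "blue"]
truth = ["red"] * 12000 + ["green"] * 6000 + ["blue"] * 2000
reports = [grr_perturb(v, domain, 1.0, rng) for v in truth]
estimates = grr_estimate(reports, domain, 1.0)
```

The collector could then sample a synthetic data set from `estimates`. Estimates for rare values can be slightly negative, so a real pipeline would likely clip and renormalize them before sampling.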

k-modes Clustering Guaranteeing Local Differential Privacy
PENG Chun-chun, CHEN Yan-li, XUN Yan-mei. k-modes Clustering Guaranteeing Local Differential Privacy [J]. Computer Science, 2021, 48(2): 105-113.
Authors: PENG Chun-chun  CHEN Yan-li  XUN Yan-mei
Affiliation: College of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210003, China
Abstract: How to mine data usefully while protecting data privacy has become an active research question. In many practical scenarios, it is difficult to find a trusted third party to process sensitive data. This paper proposes the first locally differentially private k-modes mechanism (LDPK-modes) for this distributed scenario. Unlike standard differentially private clustering mechanisms, the proposed mechanism does not need a trusted third party to collect and preprocess users' data. Users perturb their data with a randomized response mechanism that satisfies the definition of local d-privacy (local differential privacy with a distance metric). After collecting the users' perturbed data, the third party recovers its statistical features and generates a synthetic data set. The frequent attribute values in the data set are assigned to the initial cluster centers, and k-modes clustering then starts. Theoretical analysis shows that the proposed algorithm satisfies local d-privacy. Experimental results show that the proposal preserves the quality of the clustering results well without a trusted third-party data collector.
Keywords: Local differential privacy  k-modes  d-privacy  Clustering  Privacy preserving
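
The clustering step can be sketched in the same spirit. The abstract says only that frequent attribute values are assigned to the initial cluster centers. The concrete rule below (center i takes the i-th most frequent value of every attribute) is an assumption made for illustration, and `init_centers` and `k_modes` are hypothetical names rather than the paper's algorithm.

```python
from collections import Counter

def hamming(a, b):
    """Number of attributes on which two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))

def init_centers(records, k):
    """Assumed rule: center i takes the i-th most frequent value of each attribute."""
    n_attrs = len(records[0])
    ranked = [
        [v for v, _ in Counter(r[j] for r in records).most_common()]
        for j in range(n_attrs)
    ]
    return [
        tuple(ranked[j][i % len(ranked[j])] for j in range(n_attrs))
        for i in range(k)
    ]

def k_modes(records, k, max_iter=20):
    """Plain k-modes: assign by Hamming distance, update each center to the modes."""
    centers = init_centers(records, k)
    labels = []
    for _ in range(max_iter):
        labels = [min(range(k), key=lambda c: hamming(r, centers[c])) for r in records]
        new_centers = []
        for c in range(k):
            members = [r for r, lab in zip(records, labels) if lab == c]
            if not members:
                new_centers.append(centers[c])  # keep the center of an empty cluster
                continue
            new_centers.append(
                tuple(Counter(col).most_common(1)[0][0] for col in zip(*members))
            )
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels

# Usage on a tiny synthetic data set with two obvious groups.
data = [("a", "x")] * 5 + [("b", "y")] * 4
centers, labels = k_modes(data, 2)
```

In the setting the abstract describes, `records` would be the synthetic data set generated by the collector rather than the users' raw data.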
This article is indexed in databases including VIP (维普) and Wanfang Data (万方数据).