首页 | 本学科首页   官方微博 | 高级检索  
     

基于空间动态划分的差分隐私聚类算法
引用本文:张可铧,成卫青.基于空间动态划分的差分隐私聚类算法[J].计算机工程与应用,2021,57(2):97-103.
作者姓名:张可铧  成卫青
作者单位:1.南京邮电大学 计算机学院,南京 210023 2.东南大学 计算机网络和信息集成教育部重点实验室,南京 211189
基金项目:国家自然科学基金;计算机网络和信息集成教育部重点实验室资助项目
摘    要:差分隐私算法作为当前研究较多的隐私保护机制之一,有着广泛应用。目前有多种基于差分隐私保护的k]均值聚类算法,应用场景不一,各有缺陷。以往的算法通过均等划分数据集,构造等宽直方图进行聚类,这会导致没有数据分布的区域也被无差别插入噪声,影响聚类性能。针对这一点,提出了一种新的差分隐私聚类算法DPQTk]-means,先通过构建差分隐私四分树,用大小不一的自适应存储桶动态划分数据空间,充分表示数据集同时减少噪声插入,再进行k]均值聚类,证明了其满足ε]-差分隐私保护。实验结果表明,DPQTk]-means算法与以往的差分隐私聚类算法相比具有更好的聚类可用性,且能够在隐私保护水平较高的同时保持稳定的聚类性能。

关 键 词:差分隐私  四分树  动态划分  [k]均值  

Differential Privacy Clustering Algorithm Based on Spatial Dynamic Partition
ZHANG Kehua,CHENG Weiqing.Differential Privacy Clustering Algorithm Based on Spatial Dynamic Partition[J].Computer Engineering and Applications,2021,57(2):97-103.
Authors:ZHANG Kehua  CHENG Weiqing
Affiliation:1.School of Computer, Nanjing University of Posts & Telecommunications, Nanjing 210023, China 2.Key Laboratory of Computer Network & Information Integration of Ministry of Education, Southeast University, Nanjing 211189, China
Abstract:As one of the most popular privacy protection mechanisms, the differential privacy algorithm has been widely used. At present, there are a variety of k]-means clustering algorithms based on differential privacy protection. The application scenarios are different and each has its own defects. There is an algorithm that divides the data set and constructs the equal-width histograms for clustering. This causes the areas without data to be inserted into noise without any difference, which affects clustering performance. To solve this problem, a new differential privacy clustering algorithm DPQTk]-means is proposed. By constructing a differential privacy quad tree, the data space is dynamically divided by adaptive buckets of different sizes to fully represent the data set and reduce noise insertion, and then do k]-means clustering. It proves that it satisfies ε]-differential privacy protection. Experimental results show that the DPQTk]-means algorithm has better cluster availability than the previous differential privacy clustering algorithms, and can maintain stable clustering performance while maintaining a high level of privacy protection.
Keywords:differential privacy  quad tree  dynamic partition  [k]-means  
本文献已被 万方数据 等数据库收录!
点击此处可从《计算机工程与应用》浏览原始摘要信息
点击此处可从《计算机工程与应用》下载全文
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号