
DCKPDP: Differential privacy mixed attribute data publishing method for improved k-prototype clustering
Citation: Zhang Xing and Zhang Xing. Differential privacy mixed attribute data publishing method for improved k-prototype clustering[J]. Application Research of Computers (计算机应用研究), 2022, 39(1): 249-253.
Authors: Zhang Xing (张星), Zhang Xing (张兴)
Affiliation: School of Electronics & Information Engineering, Liaoning University of Technology, Jinzhou, Liaoning 121001, China
Funding: National Natural Science Foundation of China (61802161); Scientific Research Fund of the Education Department of Liaoning Province (JZL202015402, JZL202015404).
Abstract: Most current privacy protection methods for publishing mixed attribute data offer either weak privacy protection or poor data utility. This paper combines differential privacy with an optimized k-prototype clustering algorithm and proposes DCKPDP, a differential privacy mixed attribute data publishing method based on improved k-prototype clustering. First, because the traditional k-prototype algorithm ignores the fact that different numerical attributes can influence the clustering result to very different degrees, information entropy is used to assign a weight to each numerical attribute. Second, because initial cluster centers that are set by hand or chosen by a random algorithm lead to low clustering accuracy, the initial centers are selected adaptively during clustering by combining the local density and the high density of the data objects. Third, to reduce the high risk of data leakage, differential privacy protection is applied to the cluster center values. The experimental results show that DCKPDP needs less noise to satisfy differential privacy and yields better data availability.

Keywords: differential privacy; mixed attribute data; k-prototype clustering; density optimization; information entropy
Received: 2021/6/28
Revised: 2021/12/17
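
The attribute weighting and the center perturbation described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract gives no formulas, so the code assumes the standard entropy-weight method for the numerical attribute weights, and a Laplace mechanism applied to per-cluster sums and counts (as in common differentially private k-means variants) for the cluster centers. The function names `entropy_weights`, `laplace`, and `noisy_center`, the [0, 1] attribute range, and the sensitivity d + 1 are assumptions made for illustration. Categorical attributes and the density-based choice of initial centers are left out.

```python
import math
import random


def entropy_weights(rows):
    """Entropy-weight method (assumed): one weight per numeric column, summing to 1.

    Expects at least two rows; each row is a list of numeric attribute values.
    """
    n, d = len(rows), len(rows[0])
    divergences = []
    for j in range(d):
        col = [row[j] for row in rows]
        lo, hi = min(col), max(col)
        # Min-max normalise; a constant column carries no information.
        norm = [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in col]
        total = sum(norm)
        if total == 0.0:
            divergences.append(0.0)
            continue
        probs = [v / total for v in norm]
        entropy = -sum(p * math.log(p) for p in probs if p > 0) / math.log(n)
        # Lower entropy means the attribute discriminates more, so it gets more weight.
        divergences.append(max(0.0, 1.0 - entropy))
    s = sum(divergences)
    if s == 0.0:
        return [1.0 / d] * d  # no attribute is informative: fall back to equal weights
    return [g / s for g in divergences]


def laplace(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale` is Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_center(points, epsilon, rng=random):
    """Perturbed centroid of one non-empty cluster with attributes in [0, 1].

    Assumption: adding or removing one record changes the vector
    (sum_1, ..., sum_d, count) by at most d + 1 in L1 norm, so every
    component receives Laplace noise with scale (d + 1) / epsilon.
    """
    d = len(points[0])
    scale = (d + 1) / epsilon
    noisy_count = max(1.0, len(points) + laplace(scale, rng))
    center = []
    for j in range(d):
        noisy_sum = sum(p[j] for p in points) + laplace(scale, rng)
        # Clamping is post-processing and does not affect the privacy guarantee.
        center.append(min(1.0, max(0.0, noisy_sum / noisy_count)))
    return center
```

In a full clustering loop, `entropy_weights` would be computed once on the numerical attributes and used to scale their contribution to the distance, while `noisy_center` would replace the plain mean each time a center is updated. The total privacy budget would then have to be divided across iterations, and how DCKPDP does that is not stated in the abstract.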
This article is indexed in databases including VIP (维普) and Wanfang Data (万方数据).