
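Distributed privacy-preserving clustering algorithm based on wavelet transform
Cite this article: XUE Anrong, LIU Bin, WEN Dandan. Distributed privacy-preserving clustering algorithm based on wavelet transform [J]. Journal of Computer Applications, 2014, 34(4): 1029-1033.
Authors: XUE Anrong, LIU Bin, WEN Dandan
Affiliation: School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang, Jiangsu 212013, China
Funding: National Natural Science Foundation of China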
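Abstract: Existing privacy-preserving clustering algorithms do not achieve a good trade-off between efficiency and privacy. To address this problem, a distributed privacy-preserving clustering algorithm is proposed that combines Secure Multi-party Computation (SMC) with data perturbation. Each data party uses a wavelet transform to compress its data and hide information, and randomly reorders the attribute columns to prevent the information leakage that data reconstruction could cause. Only the compressed and reordered data take part in the distributed clustering computation, so the computation and communication costs are small and the algorithm is efficient, while the multiple protection measures effectively protect the private data. Because the wavelet transform has high fidelity, it has little effect on clustering accuracy. Theoretical analysis and experimental results show that the proposed algorithm is secure and efficient, that on high-dimensional data its global F-measure and execution efficiency are better than those of the Haar-wavelet-based Discrete Cosine Transform (DCT-H) algorithm, and that it resolves the trade-off between efficiency and privacy.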
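
The perturbation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the choice of a one-level Haar approximation, the padding rule for odd-length records, the use of a seed shared by all parties, and the names `haar_approximation`, `perturb` and `euclidean` are all assumptions made for illustration. The SMC stage is not shown.

```python
import math
import random


def haar_approximation(row):
    """One-level Haar approximation: scaled sums of adjacent attribute pairs.

    Keeping only these coefficients halves the dimensionality of the record.
    """
    values = list(row)
    if len(values) % 2:
        # Assumption: pad odd-length records by repeating the last value.
        values.append(values[-1])
    scale = math.sqrt(2.0)
    return [(values[i] + values[i + 1]) / scale for i in range(0, len(values), 2)]


def perturb(records, seed):
    """Compress every record, then apply one random column order to all of them.

    Assumption: every data party uses the same seed so that the permuted
    columns still line up across parties.
    """
    reduced = [haar_approximation(r) for r in records]
    order = list(range(len(reduced[0])))
    random.Random(seed).shuffle(order)
    return [[r[j] for j in order] for r in reduced]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical records: the first two are similar, the third is far from both.
records = [
    [1.0, 1.2, 5.0, 5.1, 9.0, 9.3, 2.0, 2.1],
    [1.1, 1.0, 5.2, 5.0, 9.1, 9.0, 2.2, 2.0],
    [8.0, 8.1, 0.5, 0.4, 3.0, 3.2, 7.0, 7.1],
]
released = perturb(records, seed=2014)

print(len(records[0]), "->", len(released[0]), "attributes per record")
print("original distance 0-2:", euclidean(records[0], records[2]))
print("released distance 0-2:", euclidean(released[0], released[2]))
```

Because the Haar approximation is a projection by an orthonormal transform, a distance between released records never exceeds the distance between the originals, and a column permutation leaves Euclidean distances unchanged. This is one way to see why a distance-based clustering result can survive the compression.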

Keywords: privacy preserving; clustering; wavelet transform; secure multi-party computation; distributed
Received: 2013-09-29
Revised: 2013-11-15

Privacy preserving clustering algorithm based on wavelet transform for distributed data
XUE Anrong, LIU Bin, WEN Dandan. Privacy preserving clustering algorithm based on wavelet transform for distributed data [J]. Journal of Computer Applications, 2014, 34(4): 1029-1033.
Authors: XUE Anrong, LIU Bin, WEN Dandan
Affiliation: School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang, Jiangsu 212013, China
Abstract: Existing privacy-preserving clustering algorithms fail to strike a good trade-off between efficiency and privacy. To resolve this problem, a distributed privacy-preserving clustering algorithm based on Secure Multi-party Computation (SMC) combined with data perturbation was proposed. Data owners used the wavelet transform to achieve both data reduction and information hiding, and randomly rearranged the attribute columns to prevent the information disclosure that data reconstruction could cause. Because the proposed algorithm used only the reduced and rearranged data in its distributed clustering computation, its computation and communication costs were low and its efficiency was improved. At the same time, the multiple protection measures effectively preserved data privacy. Clustering accuracy was only slightly affected because of the high fidelity of the wavelet transform. Theoretical analysis and experimental results indicate that the proposed algorithm is secure and efficient, and that its overall F-measure and efficiency are better than those of the DCT-H (Discrete Cosine Transform-Haar) algorithm on high-dimensional datasets. The algorithm thus resolves the trade-off between efficiency and privacy.
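Keywords: privacy preserving; clustering; wavelet transform; Secure Multi-party Computation (SMC); distributed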
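
The abstract reports an overall F-measure but this page does not define it. The sketch below uses a definition that is common in the clustering literature: for each true class, take the best F-score over all clusters, then average those scores weighted by class size. Whether the paper uses exactly this definition is an assumption, and the function name `overall_f_measure` and the sample labels are hypothetical.

```python
from collections import Counter


def overall_f_measure(true_labels, cluster_labels):
    """Class-size-weighted average of each class's best F-score over clusters."""
    n = len(true_labels)
    class_sizes = Counter(true_labels)
    cluster_sizes = Counter(cluster_labels)
    overlap = Counter(zip(true_labels, cluster_labels))

    total = 0.0
    for cls, n_i in class_sizes.items():
        best = 0.0
        for clu, n_j in cluster_sizes.items():
            n_ij = overlap[(cls, clu)]
            if n_ij == 0:
                continue  # no shared records, so the F-score is 0
            precision = n_ij / n_j
            recall = n_ij / n_i
            best = max(best, 2 * precision * recall / (precision + recall))
        total += (n_i / n) * best
    return total


# Hypothetical labels: one record of class "a" lands in the wrong cluster.
truth = ["a", "a", "a", "b", "b", "b"]
found = [0, 0, 1, 1, 1, 1]
print(round(overall_f_measure(truth, found), 4))  # 0.8286
```

The score is 1.0 only when every class coincides exactly with one cluster, so it can be used to compare a clustering of perturbed data against known class labels.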