
Privacy-Preserving Data Publication for Clustering (面向聚类的数据隐藏发布研究)

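Cite this article:	Ni Weiwei, Chen Geng, Chong Zhihong, Wu Yingjie. Privacy-Preserving Data Publication for Clustering[J]. Journal of Computer Research and Development (计算机研究与发展), 2012, 49(5): 1095-1104.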

Authors:	Ni Weiwei (倪巍伟), Chen Geng (陈耿), Chong Zhihong (崇志宏), Wu Yingjie (吴英杰)

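Affiliation:	1. College of Computer Science and Engineering, Southeast University, Nanjing 210096; 2. School of Information Science, Nanjing Audit University, Nanjing 211815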

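Funding:	National Natural Science Foundation of China; Open Fund of the Key Laboratory of Network and Information Integration (Southeast University), Ministry of Education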

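Abstract:	Privacy-preserving data publication seeks a compromise between protecting data privacy and maintaining data utility, and it has attracted sustained attention from researchers in recent years. Both its motivation and its goal stem from the value of using the data. Clustering, an important step in realizing the deeper value of data, has been widely studied in data mining. Clustering depends on the characteristics of individual records, whereas obfuscation is guided by the idea of weakening those characteristics; this conflict makes privacy-preserving data publication for clustering a difficult problem. This paper summarizes existing research on privacy-preserving data publication for clustering. From the perspective of the granularity of the clustering features that are preserved, it analyzes the relationship among that granularity, clustering utility, and the security of privacy protection. From the perspective of maintaining the clustering utility of the data, it analyzes the principles and characteristics of the main obfuscation methods: anonymization, randomization, data swapping, and synthetic data substitution. Building on an in-depth comparative analysis of existing techniques, it identifies open problems and future directions for privacy-preserving data publication for clustering.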
Keywords:	privacy preservation; clustering; data obfuscation; clustering utility; data publication
Privacy-Preserving Data Publication for Clustering

Ni Weiwei, Chen Geng, Chong Zhihong, Wu Yingjie. Privacy-Preserving Data Publication for Clustering[J]. Journal of Computer Research and Development, 2012, 49(5): 1095-1104.

Authors:	Ni Weiwei, Chen Geng, Chong Zhihong, Wu Yingjie

Affiliation:	1. College of Computer Science and Engineering, Southeast University, Nanjing 210096; 2. School of Information Science, Nanjing Audit University, Nanjing 211815

Abstract:	Privacy-preserving data publication has attracted sustained attention in recent years. It seeks a trade-off between preserving data privacy and maintaining data utility. Clustering, a crucial step in advanced data analysis, has been widely studied in data mining. Clustering and data obfuscation pull in opposite directions: clustering relies heavily on the characteristics of individual records to partition data into clusters, whereas data obfuscation usually suppresses individual characteristics to avoid leaking individual privacy. It is therefore difficult to guarantee both the privacy and the clustering utility of the published data at the same time. Various distortion and limited-distribution techniques have been applied to this problem. This paper surveys the state of the art in data obfuscation methods for clustering applications. It discusses how the granularity of the clustering characteristics to be preserved, the maintenance of clustering usability, and the security of data privacy constrain one another. It then compares the principles and merits of prevalent methods, such as data anonymization, data randomization, data swapping, and synthetic data substitution, in terms of how well they reconcile privacy preservation with clustering usability. Following a comprehensive analysis of the existing techniques, it highlights some open problems and future directions.

Keywords:	privacy preservation; clustering; data obfuscation; clustering utility; data publication
This article is indexed in CNKI, Wanfang Data, and other databases.
