
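满足LDP的多维数据联合分布估计 (Joint Distribution Estimation for Multidimensional Data Based on LDP)
Cite this article: 褚雪君, 龙士工, 刘海. 满足LDP的多维数据联合分布估计[J]. 计算机系统应用, 2022, 31(8): 230-238.
Authors: 褚雪君 (CHU Xue-Jun)  龙士工 (LONG Shi-Gong)  刘海 (LIU Hai)
Affiliation: College of Computer Science and Technology, Guizhou University, Guiyang 550025, China; Guizhou Provincial Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025, China
Funding: National Natural Science Foundation of China (62062020, 62002081)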
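Abstract (translated from the Chinese): Publishing and analyzing multidimensional data can create great value, but privacy leaks often occur during data collection. Traditional centralized differential privacy requires a fully trusted third-party data collector, which is hard to find in practice. As the number of attribute dimensions grows, the collector's refinement work (computing the joint distribution) also becomes a pressing problem. To address these problems, the paper proposes a local differential privacy (LDP) algorithm for multi-valued data, RR-LDP, which uses unary encoding and instantaneous randomized response to protect personal privacy during data collection while reducing communication overhead. Under the LDP guarantee, it combines the expectation-maximization (EM) algorithm with a LASSO regression model into an efficient joint distribution estimation algorithm for multidimensional data, LREMH: LASSO regression estimates the initial value, and EM then iterates from it. Theoretical analysis and experimental results show that LREMH strikes a balance between accuracy and efficiency.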

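Keywords: multidimensional data  local differential privacy  EM algorithm  LASSO regression  joint distribution estimation  privacy protection  randomized response
Received: 2021/11/24
Revised: 2021/12/20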
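
The collection step named in the abstract (unary encoding followed by randomized response) can be illustrated with a minimal sketch. This is not the paper's RR-LDP algorithm, whose parameters are not given on this page; it assumes the common symmetric variant in which every bit of the one-hot vector is kept with probability p = e^(ε/2) / (e^(ε/2) + 1). The function names are hypothetical.

```python
import math
import random


def unary_encode(value, domain_size):
    """One-hot (unary) encode a value taken from range(domain_size)."""
    bits = [0] * domain_size
    bits[value] = 1
    return bits


def keep_probability(epsilon):
    """Probability of reporting a bit truthfully (assumed symmetric scheme).

    Two unary encodings differ in exactly two bits, so keeping each bit with
    probability p = e^(eps/2) / (e^(eps/2) + 1) bounds the likelihood ratio
    of any report by (p / (1 - p))^2 = e^eps.
    """
    half = math.exp(epsilon / 2.0)
    return half / (half + 1.0)


def perturb(bits, epsilon, rng=random):
    """Flip each bit independently with probability 1 - p (user side)."""
    p = keep_probability(epsilon)
    return [b if rng.random() < p else 1 - b for b in bits]


def estimate_frequencies(reports, epsilon):
    """Unbiased frequency estimate per value from noisy reports (collector side)."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive to invert the noise")
    p = keep_probability(epsilon)
    q = 1.0 - p
    n = len(reports)
    estimates = []
    for position in range(len(reports[0])):
        observed = sum(report[position] for report in reports) / n
        # E[observed] = f * p + (1 - f) * q, solved for the true frequency f.
        estimates.append((observed - q) / (p - q))
    return estimates
```

Each user sends only the perturbed bit vector, so the collector never sees the true value; individual estimates can fall slightly outside [0, 1] because the correction is unbiased rather than constrained.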

Joint Distribution Estimation for Multidimensional Data Based on LDP
CHU Xue-Jun, LONG Shi-Gong, LIU Hai. Joint Distribution Estimation for Multidimensional Data Based on LDP[J]. Computer Systems & Applications, 2022, 31(8): 230-238.
Authors: CHU Xue-Jun  LONG Shi-Gong  LIU Hai
Affiliation: College of Computer Science and Technology, Guizhou University, Guiyang 550025, China; Guizhou Provincial Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025, China
Abstract: The release and analysis of multidimensional data can produce great value, but privacy disclosure often occurs in the data collection phase. The traditional centralized differential privacy method requires a completely trusted third-party data collector, which is difficult to find in practice. As the number of attribute dimensions increases, the collector's refinement work (the calculation of the joint distribution) has also become an urgent problem. To address these problems, this study proposes a local differential privacy (LDP) algorithm for multi-valued data (RR-LDP). Unary encoding and the instantaneous randomized response technique are introduced to protect personal privacy in the data collection phase, which also reduces communication overhead. By combining the expectation maximization (EM) algorithm with a LASSO regression model, the study puts forward an efficient joint distribution estimation algorithm for multidimensional data (LREMH) that satisfies LDP. The algorithm uses the LASSO regression model to estimate the initial value and employs the EM algorithm for iterative calculation. Theoretical analysis and experimental results show that the LREMH algorithm achieves a balance between accuracy and efficiency.
Keywords: multidimensional data  local differential privacy  expectation maximization (EM) algorithm  LASSO regression  joint distribution estimation  privacy protection  randomized response
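
The estimation step described in the abstract (an initial value refined by EM iterations) can be sketched as follows. This is a generic EM estimator for a joint distribution over bit-flipped unary encodings, not the paper's LREMH algorithm: the keep-probability `p`, the function names, and the `init` parameter are assumptions for illustration. The abstract states that LASSO regression supplies the initial value; here `init` is only a hypothetical hook for such an estimate, and a uniform start is used when it is omitted.

```python
import itertools
import math


def report_likelihood(report, value, p):
    """P(report | true value) when each unary-encoded bit is kept with probability p."""
    likelihood = 1.0
    for position, bit in enumerate(report):
        true_bit = 1 if position == value else 0
        likelihood *= p if bit == true_bit else 1.0 - p
    return likelihood


def em_joint_distribution(reports, domain_sizes, p, init=None,
                          max_iterations=200, tolerance=1e-6):
    """Estimate the joint distribution over all attribute-value combinations.

    reports: one tuple per user holding a perturbed bit vector per attribute.
    init: optional starting distribution (hypothetical hook for an initial
          estimate such as a LASSO fit); uniform when omitted.
    Assumes 0 < p < 1 so every report has nonzero likelihood.
    """
    cells = list(itertools.product(*(range(size) for size in domain_sizes)))
    # Likelihood of every report under every candidate combination.
    likelihoods = [
        [math.prod(report_likelihood(bits, value, p)
                   for bits, value in zip(report, cell))
         for cell in cells]
        for report in reports
    ]
    if init is not None:
        probabilities = list(init)
    else:
        probabilities = [1.0 / len(cells)] * len(cells)
    for _ in range(max_iterations):
        # E-step: posterior weight of each combination for each report.
        totals = [0.0] * len(cells)
        for row in likelihoods:
            weighted = [prior * like for prior, like in zip(probabilities, row)]
            normalizer = sum(weighted)
            for index, weight in enumerate(weighted):
                totals[index] += weight / normalizer
        # M-step: the new estimate is the average posterior weight.
        updated = [total / len(reports) for total in totals]
        change = max(abs(new - old) for new, old in zip(updated, probabilities))
        probabilities = updated
        if change < tolerance:
            break
    return dict(zip(cells, probabilities))
```

The number of candidate combinations is the product of the attribute domain sizes, so the cost of each iteration grows quickly with dimensionality; a starting point closer to the true distribution than the uniform one would be expected to reduce the number of iterations needed.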