
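A Personalized Differential Privacy Protection Mechanism for Trajectory Data Publication
Cite this article: TIAN Feng, WU Zhen-Qiang, LU Lai-Feng, LIU Hai, GUI Xiao-Lin. A Personalized Differential Privacy Protection Mechanism for Trajectory Data Publication[J]. Chinese Journal of Computers (计算机学报), 2021, 44(4): 709-723.
Authors: TIAN Feng (田丰), WU Zhen-Qiang (吴振强), LU Lai-Feng (鲁来凤), LIU Hai (刘海), GUI Xiao-Lin (桂小林)
Affiliations: School of Computer Science, Shaanxi Normal University, Xi'an 710062; School of Mathematics and Information Science, Shaanxi Normal University, Xi'an 710062; State Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025; School of Computer Science and Technology, Xi'an Jiaotong University, Xi'an 710049
Funding: Supported by the National Natural Science Foundation of China (61602290, 61902229, 61672334, 61802242, 61802241); the Natural Science Basic Research Program of Shaanxi Province (2017JQ6038, 2020JM-288); the Major Science and Technology Special Project of Guizhou Province (2018BDKFJJ004); and the Fundamental Research Funds for the Central Universities (GK202103090, GK202103084).
Abstract: The spread of the mobile internet and smartphones has made daily life far more convenient and has generated large volumes of trajectory data. Analyzing published trajectory data can effectively improve the quality of location-based services and thereby advance smart-city applications such as intelligent traffic management, infrastructure planning, and road congestion warning and detection. However, because trajectory data contains users' sensitive information, publishing raw trajectory data directly poses a serious threat to personal privacy. Differential privacy, a security mechanism with a rigorous formal definition and strong privacy guarantees, has been widely applied to trajectory data publication. Existing methods, however, assume that all users share the same privacy preference and give every user the same level of protection, which leaves some users under-protected and others over-protected. To meet users' differing privacy requirements and improve data utility, this paper assumes that users have different privacy needs and proposes a personalized differentially private publication mechanism for trajectory data. The mechanism uses the Hilbert curve to extract the distribution characteristics of the trajectory data at each timestamp and generate location clusters, uses a sampling mechanism and the exponential mechanism to select a representative for each location cluster, and then generalizes the original trajectories with these representative locations to produce the trajectory data to be published. Experiments on a real trajectory data set show that, compared with the method based on standard differential privacy, the proposed mechanism strikes a better balance between privacy protection and data utility.

Keywords: personalized differential privacy; Hilbert curve; sampling mechanism; trajectory data publication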
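
The Hilbert-curve clustering step mentioned in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' algorithm: it assumes locations have already been snapped to cells of an n-by-n grid (n a power of two), and the function names, the `max_gap` threshold, and the gap-based splitting rule are all hypothetical, since the abstract does not say how the index sequence is cut into clusters.

```python
def hilbert_index(n, x, y):
    """Map grid cell (x, y) of an n-by-n grid (n a power of two)
    to its position along the Hilbert curve (standard xy-to-d conversion)."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the canonical orientation.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d


def cluster_by_index(points, n, max_gap):
    """Cluster grid cells by one linear scan over their sorted Hilbert indexes.

    ASSUMPTION: a new cluster starts whenever two consecutive indexes differ
    by more than max_gap. The number of clusters is therefore not fixed in
    advance; it follows from how the locations are distributed.
    """
    if not points:
        return []
    ordered = sorted(points, key=lambda p: hilbert_index(n, p[0], p[1]))
    clusters = [[ordered[0]]]
    prev = hilbert_index(n, ordered[0][0], ordered[0][1])
    for p in ordered[1:]:
        cur = hilbert_index(n, p[0], p[1])
        if cur - prev > max_gap:
            clusters.append([p])      # large jump along the curve: new cluster
        else:
            clusters[-1].append(p)    # close along the curve: same cluster
        prev = cur
    return clusters
```

Calling `cluster_by_index(points, n, max_gap)` once per timestamp yields a separate clustering for each timestamp, which matches the abstract's point that the cluster count may differ from one timestamp to the next.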

A Sample Based Personalized Differential Privacy Mechanism for Trajectory Data Publication
TIAN Feng, WU Zhen-Qiang, LU Lai-Feng, LIU Hai, GUI Xiao-Lin. A Sample Based Personalized Differential Privacy Mechanism for Trajectory Data Publication[J]. Chinese Journal of Computers, 2021, 44(4): 709-723.
Authors: TIAN Feng, WU Zhen-Qiang, LU Lai-Feng, LIU Hai, GUI Xiao-Lin
Affiliation: School of Computer Science, Shaanxi Normal University, Xi'an 710062; School of Mathematics and Information Science, Shaanxi Normal University, Xi'an 710062; State Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025; School of Computer Science and Technology, Xi'an Jiaotong University, Xi'an 710049
Abstract: The widespread adoption of smartphones and the mobile internet has made people's lives more convenient. Meanwhile, large volumes of users' trajectory data are collected and analyzed to provide better location-based services. Publishing trajectory data can benefit applications such as intelligent transportation management, infrastructure planning, and road congestion prediction and detection. Because trajectory data contains users' sensitive information, publishing the original trajectory data risks privacy leakage. To solve this problem, researchers have proposed privacy-preserving schemes that obfuscate the original trajectory data. These schemes are mainly based on partition-based privacy models, such as k-anonymity and confidence bounding, and therefore cannot resist inference attacks by adversaries with background knowledge. As a de facto standard, differential privacy guarantees the privacy level of the released data set, and the risk of privacy leakage is not affected by the attackers' background knowledge. However, existing differentially private schemes give all users the same privacy level, whereas users usually have varying privacy preferences. These schemes may narrow the scope of available trajectory data, since some users' privacy preferences cannot be guaranteed. In this paper, we propose a sample-based personalized differential privacy mechanism for trajectory data publication, which provides users with different privacy budgets. First, a location clustering algorithm is designed based on the linear indexes generated by the Hilbert curve. Inspired by space-filling curves, we partition the location set and employ the Hilbert curve to traverse the space regions and generate linear indexes of the location set. Unlike traditional two-dimensional clustering algorithms, this algorithm takes the linear indexes as input and generates clusters in one-dimensional space. The algorithm can maintain the distance and distribution characteristics of the locations in the trajectory data set. By linearly scanning the indexes, the location clusters can be obtained effectively. In addition, the algorithm does not need to fix the same number of clusters for the location sets at different timestamps; instead, it generates different numbers of clusters according to the distributions of the location sets. Second, a generalization method is proposed to satisfy personalized differential privacy. This method takes individuals' different privacy preferences into account and generates the representative element of each location cluster in a personalized differentially private way. Specifically, the method determines the selection probability of each location according to its privacy budget and samples the locations in the clusters at each timestamp. Then, the exponential mechanism is employed to select the representative location of each cluster, ensuring that the trajectory generalization process satisfies personalized differential privacy. The privacy analysis confirms that the proposed mechanism satisfies the definition of personalized differential privacy. Experiments on a real trajectory data set show that the proposed mechanism achieves a better tradeoff between privacy protection and data utility than the standard differential privacy mechanism. Moreover, the representative locations are taken from the original location set, so no meaningless representative locations are generated, which ensures that the generalized trajectory data set can resist filtering attacks.
Keywords: personalized differential privacy; Hilbert curve; sample mechanism; trajectory data publication
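
The sample-then-select step described in the abstract can be sketched as below. This is a hedged illustration under stated assumptions, not the paper's implementation: the sampling probability uses the form commonly given for the sampling mechanism in the personalized differential privacy literature (keep a record with budget eps_i with probability (e^eps_i - 1)/(e^t - 1) when eps_i is below a threshold t, otherwise always keep it), and the utility function (negative sum of distances to the sampled locations), the `sensitivity` parameter, and all function names are hypothetical choices made only for this example.

```python
import math
import random


def sampling_probability(eps_i, threshold):
    """Probability of keeping a location whose owner has privacy budget eps_i.
    ASSUMPTION: standard sampling-mechanism form with budget threshold t."""
    if eps_i >= threshold:
        return 1.0
    return math.expm1(eps_i) / math.expm1(threshold)


def sample_cluster(cluster, threshold, rng):
    """cluster: list of ((x, y), eps_i) pairs. Returns the sampled locations."""
    return [loc for loc, eps in cluster
            if rng.random() < sampling_probability(eps, threshold)]


def exponential_mechanism(candidates, utility, eps, sensitivity, rng):
    """Pick one candidate with probability proportional to
    exp(eps * utility / (2 * sensitivity))."""
    scores = [utility(c) for c in candidates]
    best = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(eps * (s - best) / (2.0 * sensitivity)) for s in scores]
    return rng.choices(candidates, weights=weights, k=1)[0]


def select_representative(cluster, threshold, sensitivity, rng):
    """Sample the cluster, then choose a representative among the sampled
    locations, so the result is always an original location."""
    sampled = sample_cluster(cluster, threshold, rng)
    if not sampled:
        return None

    # HYPOTHETICAL utility: locations closer to the rest of the sample score higher.
    def utility(c):
        return -sum(math.dist(c, p) for p in sampled)

    return exponential_mechanism(sampled, utility, threshold, sensitivity, rng)
```

For example, `select_representative(cluster, threshold=1.0, sensitivity=10.0, rng=random.Random(0))` returns one of the cluster's own locations, or `None` if sampling left the cluster empty. A real deployment would have to derive `sensitivity` from the chosen utility function and the bounds of the spatial domain.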
This article is indexed in VIP (维普), Wanfang Data (万方数据), and other databases.