首页 | 本学科首页   官方微博 | 高级检索  
     

基于差分隐私的频繁序列模式挖掘算法
引用本文:李艳辉,刘浩,袁野,王国仁. 基于差分隐私的频繁序列模式挖掘算法[J]. 计算机应用, 2017, 37(2): 316-321. DOI: 10.11772/j.issn.1001-9081.2017.02.0316
作者姓名:李艳辉  刘浩  袁野  王国仁
作者单位:东北大学 计算机科学与工程学院, 沈阳 110819
基金项目:国家自然科学基金资助项目(61033007,61622202,61572119);国家973计划项目(2012CB316201);教育部中央高校基本科研业务费资助项目(N150402005)。
摘    要:针对当数据集含有敏感信息时,直接发布频繁序列模式本身及其支持度计数都有可能泄露用户隐私信息的问题,提出一种满足差分隐私(DP)的频繁序列模式挖掘(DP-FSM)算法。该算法利用向下封闭性质生成候选序列模式集,基于智能截断方法从候选模式中挑选出频繁的序列模式,最后采用几何机制对所选出模式的真实支持度添加噪声进行扰动。另外,为了提高挖掘结果的可用性,设计了一个阈值修正的策略来减小挖掘过程中的截断误差和传播误差。理论分析证明了该算法满足ε-差分隐私。实验结果表明了该算法在拒真率(FNR)和相对支持度误差(RSE)两个指标上明显低于对比算法PFS2,有效地提高了挖掘结果的准确度。

关 键 词:频繁序列挖掘  差分隐私  隐私保护  几何机制  数据挖掘  
收稿时间:2016-08-12
修稿时间:2016-09-10

Frequent sequence pattern mining with differential privacy
LI Yanhui,LIU Hao,YUAN Ye,WANG Guoren. Frequent sequence pattern mining with differential privacy[J]. Journal of Computer Applications, 2017, 37(2): 316-321. DOI: 10.11772/j.issn.1001-9081.2017.02.0316
Authors:LI Yanhui  LIU Hao  YUAN Ye  WANG Guoren
Affiliation:School of Computer Science and Engineering, Northeastern University, Shenyang Liaoning 110819, China
Abstract:Focusing on the issue that releasing frequent sequence patterns and the corresponding true supports may reveal the individuals' privacy when the data set contains sensitive information, a Differential Private Frequent Sequence Mining (DP-FSM) algorithm was proposed. Downward closure property was used to generate a candidate set of sequence patterns, smart truncating based technique was used to sample frequent patterns in the candidate set, and geometric mechanism was utilized to perturb the true supports of each sampled pattern. In addition, to improve the usability of the results, a threshold modification method was proposed to reduce truncation error and propagation error in mining process. The theoretical analysis show that the proposed method is ε-differentially private. The experimental results demonstrate that the proposed method has lower False Negative Rate (FNR) and Relative Support Error (RSE) than that of the comparison algorithm named PFS2, thus effectively improving the accuracy of mining results.
Keywords:frequent sequence mining   Differential Privacy (DP)   privacy protection   geometric mechanism   data mining
点击此处可从《计算机应用》浏览原始摘要信息
点击此处可从《计算机应用》下载全文
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号