
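基于差分隐私的数据世系发布方法 (A Data Provenance Publishing Method Based on Differential Privacy)
Cite this article: NI Wei-Wei, SHEN Tao, YAN Dong. 基于差分隐私的数据世系发布方法 [J]. 计算机学报 (Chinese Journal of Computers), 2020, 43(3): 573-586.
Authors: NI Wei-Wei (倪巍伟)  SHEN Tao (沈涛)  YAN Dong (闫冬)
Affiliation: School of Computer Science and Engineering, Southeast University, Nanjing 211189; Key Laboratory of Computer Network and Information Integration (Southeast University), Ministry of Education, Nanjing 211189
Abstract: Data provenance describes the mechanism and process by which data is generated and evolves, and it matters for data quality assessment, data recovery, and data analysis. As data sharing deepens, the need to share the provenance workflow, the main representation structure of data provenance, is becoming increasingly urgent. The node modules in a provenance workflow and the temporal relations among its nodes may involve the privacy of the data owner, so sharing them inevitably raises privacy protection problems. Existing research focuses on maintaining the local mapping relations of the provenance workflow. It maintains the workflow's temporal constraint relations, an important aspect of workflow utility, only weakly, and it lacks protection for the privacy of the directed degree distribution of adjacent nodes. To address these problems, the Input and Output Degree Sequence with Scale i (IO-iD) model is introduced, which describes the degree distribution of provenance workflow nodes while also capturing the directional characteristics of the workflow. A Previous-Next temporal sequence structure is proposed to describe the substructure characteristics of a node and its adjacent nodes in the workflow. On this basis, DpriPP, a differential-privacy-based algorithm for privacy-preserving provenance workflow publishing, is proposed. It achieves privacy-preserving publishing with weak dependence on background knowledge while effectively maintaining the utility of the workflow's temporal dependence relations. Theoretical analysis and experimental results show that the proposed algorithm protects the privacy of the directed degree distribution of locally adjacent nodes while effectively maintaining the utility of both the local and the global temporal dependence relations among workflow nodes.

Keywords: privacy protection  provenance workflow  differential privacy  IO-iD sequence model  Previous-Next temporal sequence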

Differential Privacy Based on Data Provenance Publishing Method
NI Wei-Wei,SHEN Tao,YAN Dong.Differential Privacy Based on Data Provenance Publishing Method[J].Chinese Journal of Computers,2020,43(3):573-586.
Authors:NI Wei-Wei  SHEN Tao  YAN Dong
Affiliation:(Department of Computer Science and Engineering,Southeast University,Nanjing 211189;Key Laboratory of Computer Network and Information Integration in Southeast University,Ministry of Education,Nanjing 211189)
Abstract: Data provenance describes the mechanism and process of data generation and evolution. It records the node module executions used to produce concrete data items, as well as the intermediate data items passed as parameters between those executions. Data provenance plays an important role in research on and applications of data quality assessment, data recovery, and data analysis. As data sharing deepens, the need to publish and share the provenance workflow, which is the main representation structure of data provenance, is becoming increasingly urgent. However, a provenance workflow often contains private or confidential data. For instance, the node modules in the workflow, as well as the temporal relations among those nodes, may involve the privacy of the data owner, so releasing a provenance workflow directly inevitably raises privacy concerns. Privacy-preserving publication of data provenance has therefore become an urgent problem: the utility of the published workflow must be maintained well without disclosing private data. Existing research mainly focuses on maintaining the local mapping relations of the provenance workflow. For example, privacy protection is applied to the workflow so that the degree of a single node is not leaked, or so that sensitive mapping relations are not leaked while the workflow's overall input-output mapping relations are effectively maintained. Another important aspect of workflow utility, the temporal dependence between preceding task nodes and their corresponding succeeding task nodes, is maintained relatively poorly. Existing methods also fall far short in protecting the privacy of the distribution of adjacent nodes in the workflow. To solve these problems, the input and output degree sequence with scale i (IO-iD) model is introduced to describe the degree distribution of provenance workflow nodes. It provides a carrier for describing the utility and the privacy-sensitive information of the workflow, and as a by-product it also captures the workflow's directional characteristics well. The Previous-Next sequence is further devised to describe the distribution characteristics of the workflow's substructures. This structure reduces the loss of temporal relations that adding differential privacy noise may cause. By constructing the schema of the Previous-Next sequence, the substructure characteristics of nodes and their adjacent nodes are captured, and the temporal constraints of the workflow are maintained during reconstruction. On this basis, DpriPP, a differential-privacy-based method for privacy-preserving provenance workflow publishing, is proposed. It publishes the workflow with only weak dependence on background knowledge and also maintains the workflow's temporal dependence relations well. Targeted experiments are designed to verify the effectiveness and the privacy protection of the proposal. Theoretical analysis and experimental results demonstrate that the proposed algorithm effectively maintains both the local and the global temporal dependence relations of nodes in the workflow while protecting the privacy of the directed degree distribution of locally adjacent nodes.
Keywords:privacy protection  provenance workflow  differential privacy  Input and Output Degree sequence with scale i  Previous-Next sequence
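
The abstract does not give the formal definitions of the IO-iD model or the steps of DpriPP, so the following is only an illustrative sketch of the general idea it names: summarising a directed workflow by the (in-degree, out-degree) pairs of its nodes and perturbing that summary with Laplace noise. It is not the authors' algorithm. The function names, the toy workflow, the public bound `max_degree`, and the sensitivity value of 4 (an edge-level assumption: one edge changes the pair of two nodes, each moving one count between two bins) are all assumptions made for this example.

```python
import random
from collections import Counter, defaultdict
from itertools import product


def degree_pairs(edges):
    """Return {node: (in_degree, out_degree)} for a directed edge list."""
    indeg, outdeg, nodes = defaultdict(int), defaultdict(int), set()
    for src, dst in edges:
        outdeg[src] += 1
        indeg[dst] += 1
        nodes.update((src, dst))
    return {n: (indeg[n], outdeg[n]) for n in sorted(nodes)}


def laplace(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_pair_histogram(edges, epsilon, max_degree, rng, sensitivity=4.0):
    """Histogram of clipped (in, out) degree pairs with Laplace noise.

    Every bin of the fixed (max_degree + 1)^2 domain is perturbed,
    including empty ones. `sensitivity` is an assumed edge-level bound.
    """
    pairs = degree_pairs(edges)
    counts = Counter(
        (min(i, max_degree), min(o, max_degree)) for i, o in pairs.values()
    )
    scale = sensitivity / epsilon
    domain = product(range(max_degree + 1), repeat=2)
    return {b: counts[b] + laplace(scale, rng) for b in domain}


# Hypothetical four-module workflow: load feeds clean and stats, both feed merge.
workflow = [("load", "clean"), ("load", "stats"),
            ("clean", "merge"), ("stats", "merge")]
pairs = degree_pairs(workflow)
noisy = noisy_pair_histogram(workflow, epsilon=1.0, max_degree=2,
                             rng=random.Random(0))
```

A smaller `epsilon` gives a larger noise scale and hence stronger privacy at the cost of a less accurate histogram. Rebuilding a workflow from such a noisy summary while keeping temporal order is the part the paper addresses with the Previous-Next sequence, and it is not attempted here.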
This article is indexed in VIP (维普), Wanfang Data (万方数据), and other databases.