Similar Articles
20 similar articles found (search time: 171 ms)
1.
A Privacy-Preserving Data Publishing Method Considering Attribute Weights   (Total citations: 1; self-citations: 0; citations by others: 1)
The k-anonymity model is one of the effective methods in data publishing for anonymizing an original dataset before release so as to block linking attacks, but existing k-anonymity models and their refinements do not consider that different application domains place different quality requirements on the anonymized release table. In a specific application domain, different quasi-identifier attributes contribute to different degrees to the utility of the data analysis tasks performed on the anonymized table; if the generalization of each quasi-identifier attribute is not handled differently according to the intended use of the release table, the generalized table will have poor utility and fail to meet the needs of the concrete analysis task. Based on an analysis of the characteristics of data analysis tasks in different application domains, this work first builds a concept generalization hierarchy suited to a specific problem domain by adapting the basic ODP directory system; it then assigns weights to the generalization paths of different quasi-identifier attributes during generalization to reflect the different requirements the analysis task places on each attribute; finally it designs a weight-aware anonymous data publishing algorithm, WAK (QI weight-aware k-anonymity), a flexible privacy-protection solution that preserves the utility of the anonymized release table. Example analysis and experimental results show that the generalized anonymous tables produced by this scheme achieve the specified privacy-protection goals while retaining high data utility, satisfying the data-quality requirements of specific analysis tasks in the target application domain.
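A minimal, hypothetical sketch (not the authors' WAK algorithm) of the idea that quasi-identifiers that matter more to the analysis task should be generalized later and less; the data structures, the greedy cost heuristic and all names are assumptions made for illustration.

from collections import Counter

def is_k_anonymous(rows, qi, k):
    # Every combination of quasi-identifier values must occur at least k times.
    groups = Counter(tuple(r[a] for a in qi) for r in rows)
    return all(c >= k for c in groups.values())

def wak_anonymize(records, hierarchies, weights, k):
    """Greedy sketch: repeatedly generalize the quasi-identifier whose next step costs the
    least weighted utility until the release is k-anonymous.
    hierarchies[a][lvl] maps an ORIGINAL value of attribute a to its generalization at level lvl >= 1;
    a higher weights[a] means attribute a matters more to the analysis task."""
    qi = list(hierarchies)
    levels = {a: 0 for a in qi}

    def view():
        return [{a: (hierarchies[a][levels[a]][r[a]] if levels[a] else r[a]) for a in qi}
                for r in records]

    while not is_k_anonymous(view(), qi, k):
        candidates = [a for a in qi if levels[a] < len(hierarchies[a])]
        if not candidates:
            break  # no further generalization possible
        # heavily weighted or already-generalized attributes become more expensive to touch
        a = min(candidates, key=lambda x: weights[x] * (levels[x] + 1))
        levels[a] += 1
    return view(), levels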

2.
Yang Jing, Wang Bo. Journal of Computer Research and Development, 2012, 49(12): 2603-2610
Privacy-preserving techniques for data publishing have long been an important concern in data mining and information security. Most existing research is limited to privacy protection for a single sensitive attribute, whereas real-world data frequently contain multiple sensitive attributes. At the same time, as personalized requirements keep emerging, personalized services in privacy protection are attracting increasing attention from researchers. To extend single-sensitive-attribute privacy techniques and to meet the demand for personalized services, this work studies a personalized privacy-preserving method for multiple sensitive attributes in data publishing. Building on the l-diversity principle for a single sensitive attribute, it introduces a personalization scheme based on partitioning the value domain into levels, defines a personalized l-diversity model for multiple sensitive attributes, and proposes a personalized l-diversity algorithm for multiple sensitive attributes that gives priority to minimum selectivity. Experimental results show that the method not only satisfies personalized privacy requirements but also effectively protects data privacy, reduces the information suppression rate, and preserves the usability of the published data.
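A small illustrative check, under assumed record and requirement structures, of what a personalized l-diversity requirement over multiple sensitive attributes amounts to; the paper's minimum-selectivity-first grouping algorithm itself is not reproduced here.

def satisfies_personalized_l_diversity(equivalence_class, l_requirements):
    """l_requirements, e.g. {"disease": 3, "income_level": 2}: each sensitive attribute must
    take at least that many distinct values inside the equivalence class."""
    for attr, l in l_requirements.items():
        if len({record[attr] for record in equivalence_class}) < l:
            return False
    return True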

3.
Data privacy protection in untrusted cloud computing environments has gradually become a research focus. One of the main ways to protect privacy is to encrypt database records, but operations such as sorting and range queries over ciphertexts are then difficult. Order-preserving encryption keeps the order of ciphertexts consistent with that of the plaintexts, so such operations remain possible on ciphertexts. mOPE (mutable Order-Preserving Encoding), proposed in 2013, is an order-preserving encryption method based on binary-search-tree encoding; it supports arbitrary data types and reveals nothing beyond the order of the plaintexts. Because the order-preserving encodings may change as records are inserted or deleted, however, the server incurs considerable extra overhead. This paper improves on mOPE and proposes cmOPE (custom and mutable Order-Preserving Encoding), which adjusts the order-preserving encodings by constructing a complete binary search tree, reducing the extra cost caused by encoding changes. Experimental results show that the modified encoding-adjustment strategy effectively reduces the server's computational overhead and improves the efficiency of insertions, deletions and updates on order-preserving ciphertexts.
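A minimal sketch of the binary-search-tree encoding idea that mOPE builds on: the root-to-node path (left = 0, right = 1), followed by a 1 and zero padding, yields an integer code whose numeric order matches the plaintext order. This is an illustration only; it is not the paper's cmOPE implementation and omits encryption, interaction with the server, and tree re-balancing.

class Node:
    def __init__(self, value):
        self.value, self.left, self.right = value, None, None

def insert(root, value):
    if root is None:
        return Node(value)
    if value < root.value:
        root.left = insert(root.left, value)
    elif value > root.value:
        root.right = insert(root.right, value)
    return root

def encode(root, value, bits=16):
    """Path bits + a terminating 1 + zero padding give an order-preserving integer code."""
    path, node = [], root
    while node is not None and node.value != value:
        if value < node.value:
            path.append(0); node = node.left
        else:
            path.append(1); node = node.right
    if node is None:
        raise KeyError(value)
    code_bits = path + [1] + [0] * (bits - len(path) - 1)
    return int("".join(map(str, code_bits)), 2)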

4.
Among remote attestation methods based on trusted computing, binary attestation reflects the integrity state of the platform's current configuration and is dynamic, but easily exposes privacy; attestation based on property certificates hides the platform's configuration information and provides anonymity, but is static. By embedding the binary method into the property-certificate method, a Dynamic Property Trusted Attestation (DPTA) protocol is proposed. The verifier simulates the computation of the PCR values and compares them with the PCR values in the certificate, proving that the prover's current platform satisfies certain security properties, thus addressing both the privacy-exposure and the static-attestation problems. Experiments show that this attestation method protects platform privacy, overcomes the static nature of property-certificate-based attestation, and offers both timeliness and confidentiality.
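A hedged sketch of how a verifier can recompute an expected PCR value by simulating the TPM "extend" operation over reported measurements and compare it with the value bound to a property certificate. SHA-1 and 20-byte PCRs (as in TPM 1.2) are assumed; this illustrates the comparison step only and is not the DPTA protocol.

import hashlib

def simulate_pcr(measurements, pcr_size=20):
    """PCR_new = H(PCR_old || measurement), starting from an all-zero register."""
    pcr = b"\x00" * pcr_size
    for m in measurements:
        pcr = hashlib.sha1(pcr + m).digest()
    return pcr

def attest(reported_measurements, certified_pcr_hex):
    """The prover's platform satisfies the certified property only if the simulated
    PCR matches the PCR value recorded in the property certificate."""
    return simulate_pcr(reported_measurements).hex() == certified_pcr_hex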

5.
Privacy-preserving publication of transaction data has attracted wide attention from researchers in recent years. The sparsity of transaction data makes it hard to strike a balance between individual privacy protection and data utility. Most existing methods are grouping-based anonymity models, which depend on assumptions about the adversary's background knowledge and whose published data cannot meet the needs of transaction-data analysis tasks. To address the insufficient security and utility of privacy-preserving transaction-data publication, an effective application-oriented transaction data publish strategy (TDPS) is proposed based on differential privacy and compressed sensing. It first builds a complete Trie itemset tree of the transaction database, then uses compressed sensing to add noise satisfying differential-privacy constraints to the itemset tree, producing a noisy Trie itemset tree, and finally performs frequent itemset mining on the noisy tree. Experimental results show that TDPS not only protects privacy well but also preserves data utility effectively, meeting the data-quality requirements of transaction-data analysis tasks.
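A minimal sketch of the noisy itemset-tree step described above, shown for illustration only: the compressed-sensing component is omitted, and the level-wise budget split and all names are assumptions rather than the paper's TDPS algorithm.

import math, random
from collections import defaultdict

def build_itemset_trie(transactions):
    """Count, for every prefix of each sorted transaction, how many transactions share it."""
    counts = defaultdict(int)
    for t in transactions:
        items = tuple(sorted(set(t)))
        for i in range(1, len(items) + 1):
            counts[items[:i]] += 1
    return counts

def laplace(scale):
    # Inverse-CDF sampling of Laplace(0, scale).
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))

def noisy_itemset_trie(transactions, epsilon, max_depth=5):
    counts = build_itemset_trie(transactions)
    per_level_eps = epsilon / max_depth  # naive sequential split of the budget over tree levels
    return {node: count + laplace(1.0 / per_level_eps)
            for node, count in counts.items() if len(node) <= max_depth}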

6.
Owing to the many advantages of cloud computing, users tend to outsource data mining and data analysis tasks to professional cloud service providers; as a consequence, however, users' privacy cannot be guaranteed. Many researchers currently focus on privacy protection for sensitive data storage in cloud environments, while research on privacy-preserving data analysis is still limited. Yet if big data are only protected and never mined or analyzed, they lose their enormous potential value. This paper proposes a lattice-based privacy-preserving data publishing method for cloud computing environments, which uses lattice-based encryption to build secure homomorphic operations over private data, and on this basis implements privacy-preserving clustering analysis as a data-mining service over ciphertexts in the cloud. To protect privacy, users encrypt their data before publishing them to the cloud service provider; the provider then uses the lattice-based homomorphic encryption algorithm to offer privacy-preserving k-means, privacy-preserving hierarchical clustering, and privacy-preserving DBSCAN data-mining services, without being able to access users' data directly and thus without compromising their privacy. Compared with existing privacy-preserving data publishing methods, the proposed scheme is based on the lattice Closest Vector Problem (CVP) and Shortest Vector Problem (SVP) and therefore offers high security. At the same time, the algorithm accurately preserves the distances between ciphertexts, so the mining results are more accurate and more usable than in existing work. The paper analyzes the security of the method theoretically and designs experiments to evaluate the efficiency of the proposed privacy-preserving data-mining methods; the results show that the proposed lattice-based algorithms achieve higher analysis accuracy and higher computational efficiency than existing methods.

7.
With advances in medical technology and the arrival of the big-data era, protecting the sensitive information in patients' medical records at publication time has become a current research focus. To address privacy protection when publishing medical big data, a differentially private data publishing algorithm, AUR-Tree (attribute utility value ranking-tree), is proposed. The algorithm uses attribute utility value ranking to measure how strongly each quasi-identifier attribute influences the sensitive attribute and takes this as the metric guiding iterative partitioning; it adopts a generalization-based, top-down iterative partitioning of a classification tree and allocates the privacy budget reasonably using an arithmetic-progression-like scheme, thereby achieving privacy protection during medical data publication. Experimental results show that, while greatly improving data security, effectiveness and availability, the algorithm also preserves the value of the data for subsequent data mining.
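A small sketch of one ingredient mentioned above: splitting a total privacy budget across the levels of a top-down partitioning tree with an arithmetic-progression-like scheme, giving deeper levels a larger share because their partitions are smaller. The exact allocation used by AUR-Tree is an assumption here.

def arithmetic_budget_split(total_epsilon, depth):
    """Return per-level budgets eps_1..eps_depth that grow linearly and sum to total_epsilon."""
    weights = list(range(1, depth + 1))  # 1, 2, ..., depth
    s = sum(weights)
    return [total_epsilon * w / s for w in weights]

# Example: arithmetic_budget_split(1.0, 4) -> [0.1, 0.2, 0.3, 0.4]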

8.
Big data brings enormous benefits to business innovation and community services. However, because big-data analytics can extract information beyond what people imagine, privacy has become a major concern. This paper introduces big-data analysis methods and their supporting architectures, examines the technologies related to big-data security and privacy protection, and proposes a cloud-storage-based privacy-protection scheme for big data.

9.
Li Zhuo, Song Zihui, Shen Xin, Chen Xin. Journal of Computer Applications, 2021, 41(9): 2678-2686
To address the difficulty of privacy protection during the user data submission phase of mobile crowd sensing (MCS) and the cost increase caused by privacy protection, the CS-MVP algorithm for joint privacy protection of the attributes of user-submitted data and the CS-MAP algorithm for independent privacy protection of those attributes are designed based on the principle of local differential privacy (LDP). First, a privacy model of user-submitted data and a usability model of task data are built on the basis of attribute relations, and CS-MVP and CS-MAP are used to solve the problem of maximizing usability under privacy constraints; moreover, a three-layer MCS architecture with privacy protection for user-submitted data is constructed for edge-computing-supported MCS scenarios. Theoretical analysis proves the optimality of the two algorithms under joint and independent attribute privacy constraints, respectively. Experimental results show that, under the same privacy budget and data volume, compared with LoPub and PrivKV, the accuracy of recovering the correct sensed data from user-submitted data with CS-MVP and CS-MAP increases on average by 26.94% and 84.34%, and by 66.24% and 144.14%, respectively.
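A generic k-ary randomized-response mechanism, shown only to illustrate the local-differential-privacy principle that CS-MVP and CS-MAP build on; it is not the paper's mechanism, and all names are assumptions.

import math, random

def k_randomized_response(value, domain, epsilon):
    """Report the true value with probability e^eps / (e^eps + k - 1),
    otherwise report a uniformly random *other* value; this satisfies eps-LDP."""
    k = len(domain)
    p_true = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if random.random() < p_true:
        return value
    others = [v for v in domain if v != value]
    return random.choice(others)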

10.
Since the concept of attribute-based signature (ABS) was introduced, ABS has received wide attention for its anonymity. ABS can hide a signer's real identity and thereby protect user privacy, but this anonymity may allow signers to abuse signatures without being traceable. At the same time, in particular application scenarios such as e-health or e-commerce, some sensitive information (e.g., medical techniques or transfer details) must be protected to prevent leakage of customers' private information. To address sensitive-information hiding during data transmission and the problem of signature abuse by signers, an attribute-based sanitizable signature scheme with identity traceability is proposed. The security of the scheme is reduced, in the standard model, to the Computational Diffie-Hellman (CDH) hardness assumption. The proposed scheme not only hides sensitive information and guarantees signer privacy, but also prevents signers from abusing their signatures.

11.
The publication of microdata is pivotal for medical research purposes, data analysis and data mining. These published data contain a substantial amount of sensitive information; for example, a hospital may publish many sensitive attributes such as diseases, treatments and symptoms. The release of multiple sensitive attributes is not desirable because it puts the privacy of individuals at risk. The main vulnerability of such an approach when releasing data is that if an adversary succeeds in identifying a single sensitive attribute, the other sensitive attributes can be identified by correlation. A variety of techniques such as SLOMS, SLAMSA and others already exist for the anonymization of multiple sensitive attributes; however, these techniques have drawbacks when it comes to preserving privacy and ensuring data utility. Existing frameworks fall short in preserving privacy for multiple sensitive attributes while ensuring data utility. We propose an efficient approach, (p, k)-Angelization, for the anonymization of multiple sensitive attributes. Our proposed approach protects the privacy of individuals and yields promising results compared with currently used techniques in terms of utility. The (p, k)-Angelization approach not only preserves privacy by eliminating the threat of background-join and non-membership attacks, but also reduces information loss, thus improving the utility of the released information.

12.
A Data Anonymization Method Based on Impurity Gain and Hierarchical Clustering   (Total citations: 2; self-citations: 0; citations by others: 2)
Data anonymization is one of the important means of protecting private information when data are published. This paper introduces the basic concepts and application models of data anonymization and discusses the requirements that anonymization results should satisfy. To resist background-knowledge attacks, a data anonymization method based on impurity gain and hierarchical clustering is proposed: it measures the randomness of the sensitive attribute with impurity, and controls the merging of clusters with constraints that minimize the information loss and maximize the impurity gain during generalization, so that the anonymized dataset satisfies both the k-anonymity and the l-diversity models while the information loss of generalization is minimized and the values of the sensitive attribute are evenly distributed. In the experimental section, an evaluation method for anonymization results is proposed that compares the anonymized result with the original data and assesses anonymization quality in terms of average information loss and average impurity. Experimental results verify the effectiveness of these methods.
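A sketch of the two quantities the merge criterion above balances: the entropy-based impurity of the sensitive attribute inside a cluster, and the impurity gain obtained by merging two clusters. The merge-control procedure itself is only hinted at in the abstract, so the surrounding details here are assumptions.

import math
from collections import Counter

def impurity(sensitive_values):
    """Shannon entropy of the sensitive-attribute distribution in one cluster."""
    n = len(sensitive_values)
    counts = Counter(sensitive_values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def impurity_gain(cluster_a, cluster_b):
    """How much more 'mixed' the sensitive attribute becomes after merging a and b."""
    merged = cluster_a + cluster_b
    n_a, n_b, n = len(cluster_a), len(cluster_b), len(merged)
    before = (n_a / n) * impurity(cluster_a) + (n_b / n) * impurity(cluster_b)
    return impurity(merged) - before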

13.
Attribute Privacy Protection in Social Networks Based on Node Splitting   (Total citations: 2; self-citations: 0; citations by others: 2)
Existing research shows that both the social-structure information and the non-sensitive attribute information of users in social networks increase the risk of leaking users' private attributes. To address the weaknesses of current anonymization algorithms for private attributes in social networks, namely the lack of a sound model, large perturbation of attribute distribution characteristics, and neglect of the influence of social structure and non-sensitive attributes on the distribution of sensitive attributes, a node-splitting-based anonymization algorithm for private attributes is proposed. By splitting a node's attribute links and social links, the algorithm improves node anonymity and lowers the risk of private-attribute disclosure. In addition, it quantifies the influence of social-structure information on attribute distribution and splits node attributes according to their degree of correlation, which preserves the attribute distribution well and guarantees data usability. Experimental results show that the algorithm effectively resists private-attribute disclosure while preserving data usability.

14.
To address privacy leakage caused by linking attacks, to minimize the information loss incurred by anonymization, and to improve the usability of the published dataset, an individual-oriented personalized anonymization method based on variable-length clustering is proposed. The method fully accounts for the influence of record weights on the resulting cluster centers so as to improve data usability, and grades sensitive attribute values into three levels to respond to different individuals' protection requirements. Theoretical analysis and experimental results show that the method meets personalized protection requirements for sensitive attributes while effectively reducing information loss; it is efficient, and the anonymized datasets it produces have good usability.

15.
The medical community is producing and manipulating a tremendous volume of digital data for which computerized archiving, processing and analysis are needed. Grid infrastructures are promising for dealing with challenges arising in computerized medicine, but the manipulation of medical data on such infrastructures faces both the problem of interconnecting medical information systems to Grid middleware and that of preserving patients' privacy in a wide and distributed multi-user system. These constraints often limit the use of Grids for manipulating sensitive medical data. This paper describes our design of a medical data management system that takes advantage of the advanced gLite data management services, developed in the context of the EGEE project, to fulfill the stringent needs of the medical community. It ensures medical data protection through strict data access control, anonymization and encryption. The multi-level access control provides the flexibility needed for implementing complex medical use cases. Data anonymization prevents the exposure of most sensitive data to unauthorized users, and data encryption guarantees data protection even when it is stored at remote sites. Moreover, the developed prototype provides a Grid storage resource manager (SRM) interface to standard medical DICOM servers, thereby enabling transparent access to medical data without interfering with medical practice.

16.
With the rapid growth of social network applications, more and more people are participating in social networks, and privacy protection in online social networks becomes an important issue. The illegal disclosure or improper use of users' private information will lead to unaccepted or unexpected consequences in people's lives. In this paper, we focus on authentic popularity disclosure in online social networks. To protect users' privacy, the social networks need to be anonymized. However, existing anonymization algorithms on social networks may lead to nontrivial utility loss, because the anonymization process changes the social network's structure. The social network's utility, such as retrieving data files, reading data files, and sharing data files among different users, is thereby decreased. Therefore, it is a challenge to develop an effective anonymization algorithm that protects the privacy of users' authentic popularity in online social networks without decreasing their utility. In this paper, we first design a hierarchical authorization and capability delegation (HACD) model. Based on this model, we propose a novel utility-based popularity anonymization (UPA) scheme, which integrates proxy re-encryption with keyword search techniques, to tackle this issue. We demonstrate that the proposed scheme can not only protect users' authentic popularity privacy but also keep the full utility of the social network. Extensive experiments on large real-world online social networks confirm the efficacy and efficiency of our scheme.

17.
A Clustering-Based Data Anonymization Method   (Total citations: 10; self-citations: 0; citations by others: 10)
Wang Zhihui, Xu Jian, Wang Wei, Shi Bole. Journal of Software, 2010, 21(4): 680-693
To prevent the leakage of personal privacy, the attribute values on quasi-identifiers must be generalized before data sharing, so as to eliminate linking attacks and achieve anonymous protection of sensitive attributes during sharing. Generalization increases the uncertainty of attribute values and inevitably causes some information loss. Traditional generalization is mostly built on predefined concept hierarchies, which can lead to over-generalization and much unnecessary information loss. This work divides quasi-identifier attributes into ordered and unordered types and gives a more flexible generalization strategy for each. It also quantifies the information loss caused by generalization by examining the change in the uncertainty of attribute values before and after generalization. On this basis, the data anonymization problem is transformed into a clustering problem with specific constraints. For the l-diversity model, a clustering-based data anonymization method, L-clustering, is proposed. The method satisfies the requirement of anonymous protection of sensitive attributes in data sharing, while greatly reducing the information loss caused by generalization.
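A sketch of the kind of uncertainty-based loss measure the abstract describes, with separate treatment of ordered (numeric) and unordered (categorical) quasi-identifiers. The paper's exact formulas are not reproduced; these are common stand-ins given as assumptions.

def loss_ordered(generalized_interval, domain_range):
    """Normalized width of the interval a numeric value was generalized to."""
    lo, hi = generalized_interval
    return (hi - lo) / domain_range

def loss_unordered(generalized_set, domain_size):
    """Normalized size of the value set a categorical value was generalized to."""
    return (len(generalized_set) - 1) / (domain_size - 1)

def record_loss(per_attribute_losses, weights=None):
    """Average (optionally weighted) per-attribute loss of one generalized record."""
    if weights is None:
        return sum(per_attribute_losses) / len(per_attribute_losses)
    return sum(w * l for w, l in zip(weights, per_attribute_losses)) / sum(weights)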

18.
The race for innovation has turned into a race for data. Rapid developments of new technologies, especially in the field of artificial intelligence, are accompanied by new ways of accessing, integrating, and analyzing sensitive personal data. Examples include financial transactions, social network activities, location traces, and medical records. As a consequence, adequate and careful privacy management has become a significant challenge. New data protection regulations, for example in the EU and China, are direct responses to these developments. Data anonymization is an important building block of data protection concepts, as it allows privacy risks to be reduced by altering data. The development of anonymization tools involves significant challenges, however. For instance, the effectiveness of different anonymization techniques depends on context, and thus tools need to support a large set of methods to ensure that the usefulness of data is not overly affected by risk-reducing transformations. In spite of these requirements, existing solutions typically support only a small set of methods. In this work, we describe how we have extended an open source data anonymization tool to support almost arbitrary combinations of a wide range of techniques in a scalable manner. We then review the spectrum of methods supported and discuss their compatibility within the novel framework. The results of an extensive experimental comparison show that our approach outperforms related solutions in terms of scalability and output data quality while supporting a much broader range of techniques. Finally, we discuss practical experiences with ARX and present remaining issues and challenges ahead.

19.
Although k-anonymity is a good way of publishing microdata for research purposes, it cannot resist several common attacks, such as attribute disclosure and the similarity attack. To resist these attacks, many refinements of k-anonymity have been proposed, with t-closeness being one of the strictest privacy models. While most existing t-closeness models address the case in which the original data have only a single sensitive attribute, data with multiple sensitive attributes are more common in practice. In this paper, we cover this gap with two proposed algorithms for multiple sensitive attributes that make the published data satisfy t-closeness. Based on the observation that the values of the sensitive attributes in any equivalence class must be spread as widely as possible over the entire data to make the published data satisfy t-closeness, both algorithms use different methods to partition records into groups in terms of sensitive attributes. One uses a clustering method, while the other leverages principal component analysis. Then, according to the similarity of quasi-identifier attributes, records are selected from different groups to construct an equivalence class, which reduces the loss of information as much as possible during anonymization. Our proposed algorithms are evaluated using a real dataset. The results show that the first proposed algorithm is slower on average than the second but preserves more of the original information. In addition, compared with related approaches, both proposed algorithms achieve stronger privacy protection while incurring less information loss.
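A hedged sketch of the t-closeness condition itself (not of the paper's two grouping algorithms): for a categorical sensitive attribute with equally distant values, the Earth Mover's Distance reduces to the total variation distance between the class distribution and the overall distribution.

from collections import Counter

def distribution(values):
    n = len(values)
    return {v: c / n for v, c in Counter(values).items()}

def t_closeness_ok(class_values, table_values, t):
    p, q = distribution(class_values), distribution(table_values)
    distance = 0.5 * sum(abs(p.get(v, 0.0) - q.get(v, 0.0)) for v in set(p) | set(q))
    return distance <= t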

20.
Being able to release and exploit open data gathered in information systems is crucial for researchers, enterprises and society at large. Yet these data must be anonymized before release to protect the privacy of the subjects to whom the records relate. Differential privacy is a privacy model for anonymization that offers more robust privacy guarantees than previous models, such as k-anonymity and its extensions. However, it is often disregarded that the utility of differentially private outputs is quite limited, either because of the amount of noise that needs to be added to obtain them or because utility is only preserved for a restricted type and/or a limited number of queries. In contrast, k-anonymity-like data releases make no assumptions about the uses of the protected data and thus do not restrict the number and type of analyses that can be performed. Recently, some authors have proposed mechanisms to offer general-purpose differentially private data releases. This paper extends such works with a specific focus on preserving the utility of the protected data. Our proposal builds on microaggregation-based anonymization, which is more flexible and utility-preserving than alternative anonymization methods used in the literature, in order to reduce the amount of noise needed to satisfy differential privacy. In this way, we improve the utility of differentially private data releases. Moreover, the noise reduction we achieve does not depend on the size of the dataset, but only on the number of attributes to be protected, which is more desirable behavior for large datasets. The utility benefits brought by our proposal are empirically evaluated and compared with related works for several datasets and metrics.
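A sketch of the core noise-reduction idea: after microaggregating records into groups of at least k and releasing group means, the sensitivity of each released value drops from the full domain width to domain_width / k, so far less Laplace noise is needed for the same epsilon. The grouping here is a naive one-dimensional sort-and-split, not the paper's microaggregation algorithm, and all names are assumptions.

import math, random

def laplace(scale):
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))

def microaggregate_then_perturb(values, k, epsilon, domain_width):
    values = sorted(values)
    groups = [values[i:i + k] for i in range(0, len(values), k)]
    if len(groups) > 1 and len(groups[-1]) < k:  # merge a too-small tail into the previous group
        groups[-2].extend(groups.pop())
    released = []
    for g in groups:
        mean = sum(g) / len(g)
        sensitivity = domain_width / len(g)  # one record changes the group mean by at most this
        released.append((len(g), mean + laplace(sensitivity / epsilon)))
    return released  # list of (group size, noisy group mean)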

