Similar Documents
18 similar documents found (search time: 140 ms)
1.
刘英华 《计算机科学》2013,40(Z6):349-353,383
Anonymity models have been one of the most active topics in privacy-protection research in recent years; their main concern is how to publish data in a way that both prevents disclosure of sensitive data and preserves high data utility. This paper proposes an (α[s], k)-anonymity lossy decomposition model. The model organizes the sensitive attribute into a generalization tree and, according to the specific privacy requirements of the data release, assigns each node its own personalized α constraint. Building on the idea of lossy database decomposition, the data are split into a sensitive-information table and a non-sensitive-information table, and the redundant matchings produced by the lossy join are used to achieve privacy protection. Experimental results show that the model provides good personalized protection of data privacy.
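To make the lossy-decomposition idea concrete, here is a minimal Python sketch (not the paper's algorithm; the column layout and the group_of helper are assumptions): the release is split into a quasi-identifier table and a sensitive-information table that share only a group identifier, so re-joining them yields redundant matches that mask the true associations.

```python
# Illustrative sketch of lossy decomposition; data layout and names are assumed.
def lossy_decompose(records, qi_cols, sensitive_col, group_of):
    """records: list of dicts; group_of(record) -> anonymity-group id."""
    qi_table, sens_table = [], []
    for rec in records:
        gid = group_of(rec)
        qi_table.append({**{c: rec[c] for c in qi_cols}, "gid": gid})
        sens_table.append({"gid": gid, sensitive_col: rec[sensitive_col]})
    return qi_table, sens_table

# Joining the two tables on "gid" links every record in a group to every sensitive
# value occurring in that group; this redundancy is what hides the true pairing.
```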

2.
Research on privacy-preserving k-anonymity algorithms   Cited: 4 (self-citations: 0, others: 4)
Privacy protection has become a fundamental concern for individuals and organizations alike, and k-anonymity is one of the main techniques for achieving privacy protection in data publishing. Since most k-anonymity methods rely on generalization and suppression, which depend heavily on predefined generalization hierarchies or total orders over attribute domains and therefore incur high information loss and reduced data utility, this paper proposes a clustering-based k-anonymity algorithm. Experimental results show that the algorithm improves the utility of the published data while still protecting privacy.
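As an illustration of the clustering approach, the following is a minimal sketch under assumed conditions (purely numeric quasi-identifiers and a simple greedy grouping, which need not match the paper's algorithm): records are greedily gathered into clusters of at least k, and each attribute is then published as its cluster's value range.

```python
# Illustrative greedy clustering-based k-anonymization; not the paper's algorithm.
def cluster_k_anonymize(records, k):
    """records: list of equal-length tuples of numeric quasi-identifier values."""
    remaining = list(records)
    clusters = []
    while len(remaining) >= k:
        seed = remaining.pop(0)
        # grow a cluster from the k - 1 records closest to the seed
        remaining.sort(key=lambda r: sum((a - b) ** 2 for a, b in zip(r, seed)))
        clusters.append([seed] + remaining[:k - 1])
        remaining = remaining[k - 1:]
    if remaining and clusters:              # fewer than k leftovers join the last cluster
        clusters[-1].extend(remaining)
    generalized = []
    for cl in clusters:
        # publish each attribute as the [min, max] interval of its cluster
        interval = tuple((min(r[i] for r in cl), max(r[i] for r in cl))
                         for i in range(len(cl[0])))
        generalized.append((interval, len(cl)))  # interval plus group size
    return generalized
```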

3.
The (α, k)-anonymity model does not account for differences in sensitivity among the values of a sensitive attribute and therefore cannot defend well against homogeneity attacks; in addition, the traditional generalization-based implementations suffer from low efficiency and high information loss. To address this, an (αi, k)-anonymity model based on sensitivity levels is proposed. It takes the sensitivity differences among sensitive values into account, introduces the idea of lossy joins, and provides a greedy (αi, k)-anonymity clustering algorithm. Experimental results show that the model resists homogeneity attacks and is an effective privacy-protection method.
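A minimal sketch of the per-level α constraint such models enforce (group layout, level mapping, and parameter names are illustrative, not the paper's code): within every anonymity group, the fraction of records whose sensitive value falls in sensitivity level i must not exceed αi.

```python
# Illustrative (alpha_i, k) check over already-formed anonymity groups.
def satisfies_alpha_k(groups, level_of, alpha, k):
    """groups: list of lists of sensitive values; level_of: value -> level index;
    alpha: dict mapping level index -> maximum allowed fraction in a group."""
    for group in groups:
        if len(group) < k:
            return False
        for level, limit in alpha.items():
            count = sum(1 for v in group if level_of(v) == level)
            if count / len(group) > limit:
                return False
    return True
```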

4.
《电子技术应用》2016,(12):115-118
K-anonymity is a widely used technique for information privacy protection, but applying it inevitably causes information loss in the published data, so improving the utility of K-anonymized data sets has long been a focus of K-anonymity research. This paper proposes a sampling-path-based local generalization algorithm, SPOLG. The algorithm searches a generalization lattice for a generalization path with low information loss; to shorten the search, it adopts equal-probability sampling (specifically, systematic sampling) and uses the sample in place of the full data set when looking for the target generalization path on the lattice, after which the data set is generalized along that path. Because the algorithm applies local generalization, it further reduces information loss and improves the utility of the published data set. Experimental results show that data sets anonymized by this algorithm have low information loss and high utility.
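The systematic-sampling step the abstract relies on can be sketched as follows (parameter names are illustrative; the lattice search itself is omitted): a random starting offset is chosen and every step-th record is taken, so the small sample, rather than the full table, is evaluated against candidate generalization paths.

```python
import random

def systematic_sample(records, sample_size):
    """Equal-probability systematic sampling: random start, then every step-th record."""
    step = max(1, len(records) // sample_size)
    start = random.randrange(step)
    return records[start::step][:sample_size]
```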

5.
In research on privacy protection for data publishing, lossy joins are mainly realized either through anonymity-model-based methods or through greedy (α, k)-anonymity clustering. To address the low efficiency of the former and the poor data utility of the latter, a similarity-based lossy-join method is proposed: it clusters the records to be published by their similarity and derives the lossy-join result from those clusters, overcoming the efficiency and accuracy problems of existing lossy-join methods. Experimental results show that the method effectively protects the privacy of published data.

6.
An identity-preserving anonymization method for privacy-preserving data publishing   Cited: 3 (self-citations: 0, others: 3)
In privacy-preserving data publishing, existing methods usually first remove the identifying attributes and then anonymize the quasi-identifier attributes. This paper analyzes the case in which a single individual corresponds to multiple records and proposes an anonymization method that retains the identifying attributes, further improving information utility while still preserving privacy. Two implementations are given, one based on generalization and one based on lossy joins. Experimental results show that the method improves information utility and is highly practical.

7.
《软件》2017,(11):12-17
With the rapid development of Internet technology, privacy protection has become a growing concern for society and for institutions, and the application of data mining makes privacy leakage an increasingly prominent problem; controlling such leakage in data publishing is an active research area, and K-anonymity has been one of its focal points in recent years. This paper introduces the basic concepts of K-anonymity, describes generalization and suppression techniques, and studies a Datafly-based multi-dimensional generalization K-anonymity model. It analyzes the model's principles and shortcomings and improves it accordingly: a limit on the generalization level is added during data preprocessing, and similarity analysis is introduced when selecting quasi-identifier attributes. Experiments on the improved K-anonymity show that the changes effectively increase the precision of the processed data.

8.
A personalized K-anonymity model   Cited: 1 (self-citations: 0, others: 1)
K-anonymization is a method for protecting data privacy in data publishing. Current K-anonymization methods control privacy against a set of predefined disclosure parameters, yet one important principle of privacy protection is that the owner of private information has the right to privacy autonomy [1]. This requires that anonymization take individuals' differing privacy needs into account and define personalized privacy constraints. Based on the principle of personal privacy autonomy and recent developments of the K-anonymity model, a personalized K-anonymity model is proposed, together with a personalized K-anonymity algorithm based on local recoding and generalization of sensitive attributes. Experimental results show that the method completes the anonymization while satisfying personalized privacy requirements and incurs relatively small information loss.

9.
《计算机工程》2018,(1):176-181
Most existing anonymization algorithms achieve privacy protection by generalizing only the quasi-identifiers and do not consider personalized protection of sensitive attributes. To address this, a personalized privacy-protection algorithm for sensitive attributes is designed on top of the p-sensitive k-anonymity model. Sensitivity levels of the sensitive attribute are defined according to each user's own degree of sensitivity, and a generalization tree over the sensitive attribute is used to publish lower-precision sensitive values, yielding personalized protection of the sensitive attribute. Experimental results show that the algorithm effectively shortens execution time and reduces information loss while satisfying the requirement of personalized protection of sensitive attributes.
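The sensitive-attribute generalization tree can be pictured with a small sketch (the tree encoding and example values are assumptions): a user who declares a higher sensitivity level has their value replaced by a coarser ancestor in the tree.

```python
# Illustrative sketch of publishing a sensitive value at a user-chosen generalization level.
def generalize_sensitive(value, parent, levels_up):
    """parent: dict mapping a node to its parent in the generalization tree."""
    node = value
    for _ in range(levels_up):
        if node not in parent:       # reached the root
            break
        node = parent[node]
    return node

# Example (hypothetical tree):
#   parent = {"HIV": "infectious disease", "infectious disease": "disease"}
#   generalize_sensitive("HIV", parent, 2) -> "disease"
```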

10.
To address the limitations of single-sensitive-attribute anonymization and the threat of linking attacks, a greedy (αij, k, m)-anonymity model is proposed. The model protects data with multiple sensitive attributes: the values of each sensitive attribute are assigned sensitivity levels, so m sensitive attributes yield m level tables, and each level is given its own αij. A greedy (αij, k, m)-anonymization algorithm is then designed that realizes the model through locally optimal choices and raises the degree of privacy protection; four models are compared in terms of information loss, execution time, and equivalence-class sensitivity distance. Experimental results show that although the model takes slightly longer to execute, it has low information loss, offers a high degree of privacy protection, resists linking attacks, and protects data with multiple sensitive attributes.

11.
It is not uncommon in the data anonymization literature to oppose the “old” k-anonymity model to the “new” differential privacy model, which offers more robust privacy guarantees. Yet, it is often disregarded that the utility of the anonymized results provided by differential privacy is quite limited, due to the amount of noise that needs to be added to the output, or because utility can only be guaranteed for a restricted type of queries. This is in contrast with k-anonymity mechanisms, which make no assumptions on the uses of anonymized data while focusing on preserving data utility from a general perspective. In this paper, we show that a synergy between differential privacy and k-anonymity can be found: k-anonymity can help improve the utility of differentially private responses to arbitrary queries. We devote special attention to the utility improvement of differentially private published data sets. Specifically, we show that the amount of noise required to fulfill ε-differential privacy can be reduced if noise is added to a k-anonymous version of the data set, where k-anonymity is reached through a specially designed microaggregation of all attributes. As a result of noise reduction, the general analytical utility of the anonymized output is increased. The theoretical benefits of our proposal are illustrated in a practical setting with an empirical evaluation on three data sets.
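The noise-reduction intuition can be illustrated with a hedged sketch (this is not the authors' exact construction: the grouping below simply sorts one bounded numeric attribute, whereas the paper uses a specially designed, insensitive microaggregation over all attributes): releasing k-group centroids instead of raw records lets one individual's change move each published value by at most (hi - lo) / k, so the Laplace scale can shrink accordingly.

```python
import math
import random

def laplace(scale):
    """Sample Laplace(0, scale) via inverse-CDF."""
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))

def dp_release_microaggregated(values, k, lo, hi, epsilon):
    """Microaggregate a bounded attribute into groups of k, then perturb each centroid.
    Assumes the grouping itself is insensitive to any single record (ensured by design
    in the paper; the simple sort used here is only for illustration)."""
    values = sorted(values)
    groups = [values[i:i + k] for i in range(0, len(values), k)]
    scale = ((hi - lo) / k) / epsilon        # reduced per-centroid sensitivity
    released = []
    for g in groups:
        noisy_centroid = sum(g) / len(g) + laplace(scale)
        released.extend([noisy_centroid] * len(g))  # each member published as its noisy centroid
    return released
```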

12.
We study the challenges of protecting privacy of individuals in large public survey rating data in this paper. A recent study shows that personal information in supposedly anonymous movie rating records can be de-identified. The survey rating data usually contain ratings of both sensitive and non-sensitive issues. The ratings of sensitive issues involve personal privacy. Even though the survey participants do not reveal any of their ratings, their survey records are potentially identifiable by using information from other public sources. None of the existing anonymisation principles (e.g., k-anonymity, l-diversity, etc.) can effectively prevent such breaches in large survey rating data sets. We tackle the problem by defining a principle called the (k, ε)-anonymity model to protect privacy. Intuitively, the principle requires that, for each transaction t in the given survey rating data T, at least (k − 1) other transactions in T must have ratings similar to t, where the similarity is controlled by ε. The (k, ε)-anonymity model is formulated by its graphical representation, and a specific graph-anonymisation problem is studied by adopting graph modification with graph theory. Various cases are analyzed and methods are developed to make the updated graph meet the (k, ε) requirements. The methods are applied to two real-life data sets to demonstrate their efficiency and practical utility.
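A minimal check of the (k, ε) condition described above (the data layout is an assumption: each transaction is represented as a tuple of numeric ratings on the non-sensitive issues):

```python
# Illustrative check of (k, eps)-anonymity on numeric rating vectors.
def is_k_eps_anonymous(ratings, k, eps):
    """ratings: list of equal-length tuples of ratings on non-sensitive issues."""
    for i, t in enumerate(ratings):
        similar = sum(
            1 for j, other in enumerate(ratings)
            if j != i and all(abs(a - b) <= eps for a, b in zip(t, other))
        )
        if similar < k - 1:
            return False
    return True
```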

13.
The spread of smart mobile devices makes sensitive information in collected spatiotemporal data, such as personal location privacy, check-in privacy, and trajectory privacy, easy to leak, yet current research proposes protection techniques for each of these leaks separately and offers users no personalized spatiotemporal-data privacy-protection method that prevents all of them. To address this, a personalized privacy model for spatiotemporal data, (p, q, ε)-anonymity, is proposed together with a personalized spatiotemporal data privacy protection (PPPST) algorithm based on it, protecting the privacy data each user chooses to designate (location privacy, check-in privacy, and trajectory privacy). Heuristic rules are designed to generalize the spatiotemporal data, preserving the utility of the published data. In comparative experiments, the data availability rate of PPPST is on average about 4.66% and 15.45% higher than that of personalized information-data K-anonymity (IDU-K) and personalized Clique Cloak (PCC), respectively. A generalized-location search technique is also designed to improve the algorithm's efficiency. Tests and analysis on real spatiotemporal data show that PPPST effectively protects personalized spatiotemporal data privacy.

14.
In privacy-preserving data mining (PPDM), a widely used method for achieving data mining goals while preserving privacy is based on k-anonymity. This method, which protects subject-specific sensitive data by anonymizing it before it is released for data mining, demands that every tuple in the released table should be indistinguishable from no fewer than k subjects. The most common approach for achieving compliance with k-anonymity is to replace certain values with less specific but semantically consistent values. In this paper we propose a different approach for achieving k-anonymity by partitioning the original dataset into several projections such that each one of them adheres to k-anonymity. Moreover, any attempt to rejoin the projections results in a table that still complies with k-anonymity. A classifier is trained on each projection and subsequently, an unlabelled instance is classified by combining the classifications of all classifiers. Guided by classification accuracy and k-anonymity constraints, the proposed data mining privacy by decomposition (DMPD) algorithm uses a genetic algorithm to search for optimal feature set partitioning. Ten separate datasets were evaluated with DMPD in order to compare its classification performance with other k-anonymity-based methods. The results suggest that DMPD performs better than existing k-anonymity-based algorithms and there is no necessity for applying domain dependent knowledge. Using multiobjective optimization methods, we also examine the tradeoff between the two conflicting objectives in PPDM: privacy and predictive performance.

15.
Anonymization is a practical approach to protect privacy in data. The major objective of privacy preserving data publishing is to protect private information in data whereas data is still useful for some intended applications, such as building classification models. In this paper, we argue that data generalization in anonymization should be determined by the classification capability of data rather than the privacy requirement. We make use of mutual information for measuring classification capability for generalization, and propose two k-anonymity algorithms to produce anonymized tables for building accurate classification models. The algorithms generalize attributes to maximize the classification capability, and then suppress values by a privacy requirement k (IACk) or distributional constraints (IACc). Experimental results show that algorithm IACk supports more accurate classification models and is faster than a benchmark utility-aware data anonymization algorithm.
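A small sketch of the mutual-information measure used here to score classification capability (the discrete-attribute encoding is an assumption, and the generalization search itself is omitted): a candidate generalization preserves more classification capability the less it reduces I(generalized attribute; class label).

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """I(X; Y) in nats for two equally long sequences of discrete values."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )

# Example: compare I(age; label) against I(age_bucketed_into_decades; label) to see
# how much classification capability a coarser generalization of "age" gives up.
```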

16.
With the proliferation of wireless sensor networks and mobile technologies in general, it is possible to provide improved medical services and also to reduce costs as well as to manage the shortage of specialized personnel. Monitoring a person’s health condition using sensors provides a lot of benefits but also exposes personal sensitive information to a number of privacy threats. By recording user-related data, it is often feasible for a malicious or negligent data provider to expose these data to an unauthorized user. One solution is to protect the patient’s privacy by making it difficult to link specific measurements with a patient’s identity. In this paper we present a privacy-preserving architecture which builds upon the concept of k-anonymity; we present a clustering-based anonymity scheme for effective network management and data aggregation, which also protects user’s privacy by making an entity indistinguishable from other k similar entities. The presented algorithm is resource aware, as it minimizes energy consumption with respect to other more costly, cryptography-based approaches. The system is evaluated from an energy-consumption and network performance perspective, under different simulation scenarios.

17.
Preserving individual privacy when publishing data is a problem that is receiving increasing attention. Thanks to its simplicity, the concept of k-anonymity, introduced by Samarati and Sweeney [1], established itself as one fundamental principle for privacy preserving data publishing. According to the k-anonymity principle, each release of data must be such that each individual is indistinguishable from at least k−1 other individuals.
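The principle itself is easy to state as a check (illustrative sketch; the row layout is assumed): every combination of quasi-identifier values in the released table must occur at least k times.

```python
from collections import Counter

def is_k_anonymous(rows, qi_indices, k):
    """rows: list of tuples; qi_indices: positions of the quasi-identifier attributes."""
    counts = Counter(tuple(row[i] for i in qi_indices) for row in rows)
    return all(c >= k for c in counts.values())
```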

18.

k-Anonymity is one of the most well-known privacy models. Internal and external attacks were discussed for this privacy model, both focusing on categorical data. These attacks can be seen as attribute disclosure for a particular attribute. Then, p-sensitivity and p-diversity were proposed as solutions for these privacy models. That is, as a way to avoid attribute disclosure for this very attribute. In this paper we discuss the case of numerical data, and we show that attribute disclosure can also take place. For this, we use well-known rules to detect sensitive cells in tabular data protection. Our experiments show that k-anonymity is not immune to attribute disclosure in this sense. We have analyzed the results of two different algorithms for achieving k-anonymity. First, MDAV as a way to provide microaggregation and k-anonymity. Second, Mondrian. In fact, to our surprise, the number of cells detected as sensitive is quite significant, and there are no fundamental differences between Mondrian and MDAV. We describe the experiments considered, and the results obtained. We define dominance rule compliant and p%-rule compliant k-anonymity for k-anonymity taking into account attribute disclosure. We conclude with an analysis and directions for future research.
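The two sensitivity rules borrowed from tabular data protection can be sketched as follows, applied to the numeric contributions of the records in one anonymity group (the default parameter values are illustrative, not the ones used in the experiments):

```python
# Illustrative sensitive-cell rules from tabular data protection.
def dominance_rule_sensitive(contributions, n=1, k=75.0):
    """(n, k)-dominance: sensitive if the n largest contributions exceed k% of the total."""
    top = sorted(contributions, reverse=True)[:n]
    return sum(top) > (k / 100.0) * sum(contributions)

def p_percent_rule_sensitive(contributions, p=10.0):
    """p%-rule: sensitive if the total minus the two largest contributions estimates
    the largest contribution to within p percent."""
    x = sorted(contributions, reverse=True)
    largest, second = x[0], x[1] if len(x) > 1 else 0.0
    return sum(x) - largest - second < (p / 100.0) * largest
```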

