Similar literature
 18 similar documents found
1.
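傅鹤岗, 曾凯. 《计算机工程》 (Computer Engineering), 2012, 38(3): 145-147, 162
To protect private information in data mining, a multi-dimensional sensitive k-anonymity privacy protection model is proposed. Sensitive attribute disclosure is divided into general disclosure, similarity disclosure, multi-dimensional independent disclosure, cross disclosure, and multi-dimensional mixed data disclosure. On the basis of k-anonymity, the model uses clustering characteristics to label the similarity of multi-dimensional sensitive attributes, searches for anonymous records, computes the similarity between the remaining records and the already grouped records, and generalizes and publishes a dataset that satisfies the anonymity model. Experimental results show that the model is suitable for multi-dimensional sensitive data, prevents privacy disclosure, and preserves good data utility.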

2.
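Efficient clustering-based (K,L)-anonymity privacy protection   Total citations: 1, self-citations: 0, citations by others: 1
To prevent disclosure of sensitive information in published data, a clustering-based anonymity protection algorithm is proposed. The influence of quasi-identifiers on sensitive attributes, which is easily overlooked, is analyzed, and an improved K-means clustering algorithm is used to cluster the data by sensitive attribute so that the data within each cluster are more similar. Taking into account the diversity of sensitive attributes within an equivalence class, the table to be published is clustered with a (K,L)-anonymity algorithm. Experimental results show that, compared with the traditional K-anonymity algorithm, the algorithm protects privacy with less information loss and a shorter execution time.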
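The abstract does not give the formal definition of (K,L)-anonymity. As a minimal sketch, assuming one common reading (every equivalence class holds at least K records and at least L distinct sensitive values), a published table could be checked as follows. The function name, the column names, and the sample rows are hypothetical and are not taken from the paper.

```python
from collections import defaultdict

def is_kl_anonymous(records, quasi_identifiers, sensitive, k, l):
    """Check an assumed reading of (K,L)-anonymity: every equivalence class
    has at least k records and at least l distinct sensitive values."""
    classes = defaultdict(list)
    for row in records:
        # Records sharing the same quasi-identifier values form one equivalence class.
        key = tuple(row[q] for q in quasi_identifiers)
        classes[key].append(row[sensitive])
    return all(len(v) >= k and len(set(v)) >= l for v in classes.values())

# Hypothetical, already generalized table.
table = [
    {"age": "30-39", "zip": "476**", "disease": "flu"},
    {"age": "30-39", "zip": "476**", "disease": "cancer"},
    {"age": "40-49", "zip": "479**", "disease": "flu"},
    {"age": "40-49", "zip": "479**", "disease": "flu"},
]
print(is_kl_anonymous(table, ["age", "zip"], "disease", k=2, l=1))
print(is_kl_anonymous(table, ["age", "zip"], "disease", k=2, l=2))
```

The second call fails because the class with age 40-49 contains only one distinct disease, which is exactly the homogeneity that plain K-anonymity does not rule out.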

3.
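徐龙琴, 刘双印. 《计算机应用》 (Journal of Computer Applications), 2011, 31(4): 999-1002
Existing k-anonymity methods cause substantial privacy disclosure when applied directly to publishing data with multiple sensitive attributes. To address this, a privacy protection algorithm for joint sensitive attributes based on semantic similarity and multi-dimensional weighting is proposed. Using the idea of anti-clustering on semantic similarity and flexibly set weights for the values of multiple sensitive attributes, the algorithm achieves privacy protection with grouping by joint sensitive attribute values and semantic diversity, and provides different strengths of privacy protection according to application needs. Experimental results show that the method effectively protects data privacy and improves the security and practicality of data publishing.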

4.
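Using published data to analyze the sensitive attributes contained in microdata tables while preserving individual privacy is an increasingly important problem. The k-anonymity model is currently used to protect privacy in data publishing; however, because it focuses on identity disclosure, it fails to some extent to protect against attribute disclosure. A new (p+,α)-sensitive k-anonymity privacy protection model is therefore proposed: sensitive attributes are first classified by their sensitivity, and then the categories to which the sensitive attributes belong are published. Unlike earlier enhanced k-anonymity models, this model allows more information to be released without compromising privacy. Experimental results show that the proposed model significantly reduces privacy breaches.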

5.
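Differentiated multi-sensitive-attribute Lq-Diversity model and algorithm   Total citations: 1, self-citations: 0, citations by others: 1
Publishing data with multi-dimensional sensitive attributes faces the threats of general disclosure, cross disclosure, similarity disclosure, and multi-dimensional independent disclosure. To address them, this paper introduces the concepts of the sensitivity level of a sensitive attribute and the sensitivity level of a sensitive attribute value. Based on the single-dimensional l-diversity model, each sensitive attribute dimension is grouped separately, and a differentiated multi-dimensional sensitive attribute model is proposed. The security of the model for publishing data with multiple sensitive attributes is verified, and a corresponding DMSA algorithm is proposed based on the model. Experiments show that the algorithm is correct and feasible, that its suppression ratio and additional information loss are both very low, and that it offers high data utility and good privacy protection.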

6.
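Privacy-preserving (L,K)-anonymity   Total citations: 1, self-citations: 1, citations by others: 0
An (L,K)-anonymity method built on top of K-anonymity is proposed to protect data that have already been K-anonymized, and an (L,K)-anonymity algorithm is given. Experiments show that the method effectively eliminates the leakage of confidential attribute information that remains after K-anonymization and improves the security of data publishing.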

7.
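A data privacy protection algorithm for query services   Total citations: 4, self-citations: 0, citations by others: 4
Personalized information services improve the precision of Web queries but also raise the problem of data privacy protection. In a service-oriented architecture (SOA) in particular, how to protect privacy when deploying personalized applications is a challenge for personalized services. As privacy and security become increasingly important in microdata publishing, good anonymization algorithms become especially important. The paper reviews k-anonymity algorithms from earlier work that take into account the influence of quasi-identifiers on sensitive attributes, and proposes a method that computes the utility of quasi-identifiers for sensitive attributes directly from the anonymized data, together with an improved utility matrix. To better measure the information loss of anonymized data, it also proposes an improved normalized certainty penalty as an evaluation metric. Analyzing anonymized data from the perspective of privacy and security, it implements an improved L-diversity algorithm, that is, an L-diversity-satisfying algorithm based on an information loss penalty. The improved algorithm accounts for the utility of quasi-identifiers for different sensitive attributes and offers good privacy and security.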

8.
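With the arrival of the big data era, data volume is growing exponentially, and publishing all data in a single release can no longer meet the need for real-time access to data. A (p, k)-anonymity incremental update algorithm is proposed that dynamically updates the anonymized published table. To avoid privacy disclosure during dynamic updates, the algorithm uses encryption to protect sensitive attributes and builds a staging table and a temporary table to help insert pending data promptly. The algorithm overcomes the inability of traditional algorithms to update data in real time, keeps the data current, and uses encryption to strengthen privacy protection. Experimental results show that it achieves real-time data updates with low information loss and a fast update rate.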

9.
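Most current personalized privacy protection algorithms protect sensitive attributes in one of two ways: by setting different thresholds for different sensitive attributes, or by generalizing sensitive attributes and replacing the original sensitive values with less precise generalized values. Data anonymized by either method suffer from a risk of sensitive information disclosure or from large information loss, as well as reduced data utility. A personalized (p,α,k)-anonymity privacy protection algorithm is therefore proposed, which applies different anonymization methods to the sensitive values of each level in an equivalence class according to the sensitivity level of the sensitive attribute, achieving personalized privacy protection. Experiments show that, compared with other personalized privacy protection algorithms, the algorithm has a similar time cost and lower information loss.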

10.
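To prevent disclosure of sensitive attributes, the sensitive attributes of data need anonymity protection. Most algorithms proposed so far for the l-diversity model are built on concept hierarchies, which leads to unnecessary information loss. The distance metric of the KACA algorithm, which is based on distances in the attribute generalization hierarchy, is therefore combined with clustering to give a clustering-based anonymity protection algorithm for sensitive attributes. The algorithm clusters the dataset according to the requirements of the l-diversity model. Experimental results show that it both anonymizes the sensitive attribute values in the data and reduces information loss.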

11.
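To address the limitations of single-sensitive-attribute anonymization and the harm of linkage attacks, an (αij,k,m)-anonymity model based on a greedy algorithm is proposed. The model targets the protection of information with multiple sensitive attributes. It assigns levels to the sensitive values of each sensitive attribute, so that m sensitive attributes yield m level tables, and sets a specific αij for each level. An (αij,k,m)-anonymization algorithm based on a greedy strategy is then designed; it makes locally optimal choices to realize the model and increases the degree of privacy protection. Four models are compared in terms of information loss, execution time, and the sensitivity distance of equivalence classes. Experimental results show that, although its execution time is slightly longer, the model loses little information, offers a high degree of privacy protection, resists linkage attacks, and protects data with multiple sensitive attributes.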

12.
Although k-anonymity is a good way of publishing microdata for research purposes, it cannot resist several common attacks, such as attribute disclosure and the similarity attack. To resist these attacks, many refinements of k-anonymity have been proposed, with t-closeness being one of the strictest privacy models. While most existing t-closeness models address the case in which the original data have only a single sensitive attribute, data with multiple sensitive attributes are more common in practice. In this paper, we cover this gap with two proposed algorithms for multiple sensitive attributes that make the published data satisfy t-closeness. Based on the observation that the values of the sensitive attributes in any equivalence class must be as spread as possible over the entire data to make the published data satisfy t-closeness, the two algorithms use different methods to partition records into groups in terms of sensitive attributes. One uses a clustering method, while the other leverages principal component analysis. Then, according to the similarity of quasi-identifier attributes, records are selected from different groups to construct an equivalence class, which reduces the loss of information as much as possible during anonymization. Our proposed algorithms are evaluated using a real dataset. The results show that the first proposed algorithm is slower on average than the second, but the former preserves more of the original information. In addition, compared with related approaches, both proposed algorithms can achieve stronger protection of privacy and reduce less.
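To make the t-closeness requirement concrete, the following minimal sketch checks it for a single categorical sensitive attribute. It assumes the equal-ground-distance case, in which the Earth Mover's Distance between two distributions reduces to the variational distance (half the sum of absolute differences). It is a checker only and does not reproduce either of the paper's two partitioning algorithms; all names and sample rows are hypothetical, and for several sensitive attributes it would have to be applied once per attribute.

```python
from collections import Counter, defaultdict

def distribution(values):
    """Relative frequency of each value in a non-empty list."""
    n = len(values)
    return {v: c / n for v, c in Counter(values).items()}

def variational_distance(p, q):
    """Half the L1 distance between two discrete distributions."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in keys)

def max_class_distance(records, quasi_identifiers, sensitive):
    """Largest distance between a class distribution and the overall one.
    Assumes records is non-empty."""
    overall = distribution([r[sensitive] for r in records])
    classes = defaultdict(list)
    for r in records:
        classes[tuple(r[q] for q in quasi_identifiers)].append(r[sensitive])
    return max(variational_distance(distribution(v), overall)
               for v in classes.values())

def satisfies_t_closeness(records, quasi_identifiers, sensitive, t):
    return max_class_distance(records, quasi_identifiers, sensitive) <= t

# Hypothetical table with two equivalence classes.
table = [
    {"age": "30-39", "disease": "flu"},
    {"age": "30-39", "disease": "cancer"},
    {"age": "40-49", "disease": "flu"},
    {"age": "40-49", "disease": "flu"},
]
print(max_class_distance(table, ["age"], "disease"))
print(satisfies_t_closeness(table, ["age"], "disease", t=0.3))
```

A smaller maximum distance means each class looks more like the whole table, which is why the abstract stresses spreading sensitive values across equivalence classes.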

13.
The p-sensitive k-anonymity model has recently been defined as a refinement of k-anonymity. This new property requires that there be at least p distinct values for each sensitive attribute within the records sharing a set of quasi-identifier attributes. In this paper, we identify the situations in which the p-sensitive k-anonymity property is not enough to protect the sensitive attributes. To overcome the shortcoming of the p-sensitive k-anonymity principle, we propose two new enhanced privacy requirements, namely the p+-sensitive k-anonymity and (p,α)-sensitive k-anonymity properties. These two newly introduced models take different perspectives. Instead of focusing on the specific values of sensitive attributes, the p+-sensitive k-anonymity model is concerned with the categories that the values belong to. Although the (p,α)-sensitive k-anonymity model still focuses on the specific values, it includes an ordinal metric system to measure how much the specific sensitive attribute values contribute to each QI-group. We make a thorough theoretical analysis of the hardness of computing a data set that satisfies either p+-sensitive k-anonymity or (p,α)-sensitive k-anonymity. We devise a set of algorithms using the idea of top-down specification, which is clearly illustrated in the paper. We implement our algorithms on two real-world data sets and show in comprehensive experimental evaluations that the two newly introduced models are superior to the previous method in terms of effectiveness and efficiency.
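The difference between counting distinct values and counting distinct categories can be illustrated with a small checker. This is a sketch under stated assumptions: it handles one sensitive attribute, the `category_of` mapping and the sample diseases are hypothetical, and it only tests a finished table rather than implementing the paper's top-down algorithms. Passing the identity function as `category_of` gives the plain p-sensitive k-anonymity test described in the abstract.

```python
from collections import defaultdict

def is_p_plus_sensitive_k_anonymous(records, quasi_identifiers, sensitive,
                                    category_of, p, k):
    """Every QI-group needs at least k records and at least p distinct
    categories (as given by category_of) of the sensitive attribute."""
    groups = defaultdict(list)
    for row in records:
        groups[tuple(row[q] for q in quasi_identifiers)].append(row[sensitive])
    for values in groups.values():
        if len(values) < k:
            return False
        if len({category_of(v) for v in values}) < p:
            return False
    return True

# Hypothetical category hierarchy: two values share the category "severe".
CATEGORY = {"HIV": "severe", "cancer": "severe", "flu": "mild"}

table = [
    {"age": "30-39", "disease": "HIV"},
    {"age": "30-39", "disease": "cancer"},
    {"age": "40-49", "disease": "flu"},
    {"age": "40-49", "disease": "cancer"},
]

# Identity mapping: counts distinct values (p-sensitive k-anonymity).
plain = is_p_plus_sensitive_k_anonymous(
    table, ["age"], "disease", lambda v: v, p=2, k=2)
# Category mapping: counts distinct categories (p+-sensitive k-anonymity).
plus = is_p_plus_sensitive_k_anonymous(
    table, ["age"], "disease", CATEGORY.get, p=2, k=2)
print(plain, plus)
```

In this example the group aged 30-39 holds two distinct diseases, so the value-based test passes, yet both fall into one category, so an attacker still learns that the person has a severe illness and the category-based test fails.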

14.
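The (α, k)-anonymity model does not consider the differences in sensitivity between the values of a sensitive attribute and cannot adequately resist homogeneity attacks. Traditional generalization-based implementations also suffer from low efficiency and large information loss. An (αi, k)-anonymity model based on sensitivity grading is therefore proposed. It takes the sensitivity differences between sensitive values into account, introduces the idea of lossy join, and designs an (αi, k)-anonymity clustering algorithm based on a greedy strategy. Experimental results show that the model resists homogeneity attacks and is an effective privacy protection method.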
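The abstract does not spell out the model's constraints. A minimal sketch, assuming the usual (α, k) condition (each equivalence class has at least k records and no sensitive value exceeds a relative frequency of α within a class) extended with one threshold per sensitive value, might look like this. The threshold table, names, and rows are hypothetical; values without an entry are treated as unconstrained.

```python
from collections import Counter, defaultdict

def is_alpha_i_k_anonymous(records, quasi_identifiers, sensitive, alpha, k):
    """Assumed (alpha_i, k) check: class size >= k, and the relative frequency
    of each sensitive value in a class stays within its own threshold."""
    classes = defaultdict(list)
    for row in records:
        classes[tuple(row[q] for q in quasi_identifiers)].append(row[sensitive])
    for values in classes.values():
        n = len(values)
        if n < k:
            return False
        for value, count in Counter(values).items():
            # Values missing from alpha default to 1.0, i.e. no frequency limit.
            if count / n > alpha.get(value, 1.0):
                return False
    return True

# Hypothetical single equivalence class of four records.
table = [
    {"age": "30-39", "disease": "HIV"},
    {"age": "30-39", "disease": "flu"},
    {"age": "30-39", "disease": "flu"},
    {"age": "30-39", "disease": "flu"},
]
print(is_alpha_i_k_anonymous(table, ["age"], "disease", {"HIV": 0.25}, k=4))
print(is_alpha_i_k_anonymous(table, ["age"], "disease", {"HIV": 0.2}, k=4))
```

Giving highly sensitive values a tighter threshold than harmless ones is the point of grading by sensitivity; using the same threshold for every constrained value recovers the plain (α, k) condition.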

15.

k-Anonymity is one of the most well-known privacy models. Internal and external attacks were discussed for this privacy model, both focusing on categorical data. These attacks can be seen as attribute disclosure for a particular attribute. Then, p-sensitivity and p-diversity were proposed as solutions for this privacy model, that is, as a way to avoid attribute disclosure for this very attribute. In this paper we discuss the case of numerical data, and we show that attribute disclosure can also take place. For this, we use well-known rules to detect sensitive cells in tabular data protection. Our experiments show that k-anonymity is not immune to attribute disclosure in this sense. We have analyzed the results of two different algorithms for achieving k-anonymity: first, MDAV, as a way to provide microaggregation and k-anonymity; second, Mondrian. In fact, to our surprise, the number of cells detected as sensitive is quite significant, and there are no fundamental differences between Mondrian and MDAV. We describe the experiments considered and the results obtained. We define dominance rule compliant and p%-rule compliant k-anonymity, which take attribute disclosure into account. We conclude with an analysis and directions for future research.
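The abstract names the dominance rule and the p%-rule without stating them. The sketch below uses their standard formulations from tabular data protection, assuming non-negative contributions: under the (n, k) dominance rule a cell is sensitive when its n largest contributions exceed k percent of the cell total, and under the p%-rule a cell is sensitive when the total minus the two largest contributions is less than p percent of the largest one. Treating the numerical values of one equivalence class as the contributions to a cell is an assumption about how the paper applies the rules, and the function names and numbers are hypothetical.

```python
def dominance_sensitive(contributions, n, k):
    """(n, k) dominance rule: the n largest contributions exceed k% of the total."""
    xs = sorted(contributions, reverse=True)
    total = sum(xs)
    if total == 0:
        return False
    return sum(xs[:n]) > (k / 100.0) * total

def p_percent_sensitive(contributions, p):
    """p%-rule: what remains after removing the two largest contributions
    is less than p% of the largest contribution."""
    xs = sorted(contributions, reverse=True)
    if not xs:
        return False
    remainder = sum(xs[2:])  # total minus the two largest contributions
    return remainder < (p / 100.0) * xs[0]

# Hypothetical numerical values inside two equivalence classes.
skewed = [100, 50, 5]
balanced = [40, 35, 25]

print(dominance_sensitive(skewed, n=1, k=60), p_percent_sensitive(skewed, p=10))
print(dominance_sensitive(balanced, n=1, k=60), p_percent_sensitive(balanced, p=10))
```

Both rules flag the skewed class even though it contains three records, which matches the abstract's point that satisfying k-anonymity alone does not prevent attribute disclosure for numerical data.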

16.
The publication of microdata is pivotal for medical research purposes, data analysis and data mining. These published data contain a substantial amount of sensitive information; for example, a hospital may publish many sensitive attributes such as diseases, treatments and symptoms. The release of multiple sensitive attributes is not desirable because it puts the privacy of individuals at risk. The main vulnerability of such an approach to releasing data is that if an adversary succeeds in identifying a single sensitive attribute, then other sensitive attributes can be identified by correlation. A whole variety of techniques such as SLOMS, SLAMSA and others already exist for the anonymization of multiple sensitive attributes; however, these techniques have their drawbacks when it comes to preserving privacy and ensuring data utility. The extant framework is lacking in terms of preserving privacy for multiple sensitive attributes and ensuring data utility. We propose an efficient approach, (p, k)-Angelization, for the anonymization of multiple sensitive attributes. Our proposed approach protects the privacy of individuals and yields promising results compared with currently used techniques in terms of utility. The (p, k)-Angelization approach not only preserves privacy by eliminating the threat of background join and non-membership attacks but also reduces the information loss, thus improving the utility of the released information.

17.
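Most current l-diversity anonymization algorithms treat all sensitive attribute values alike, without considering their degree of sensitivity or actual distribution, and are therefore vulnerable to similarity attacks and skewness attacks. They also apply full-domain generalization when building equivalence classes, which leads to high information loss. A clustering-based personalized (lc)-anonymity algorithm is proposed. It improves the security of data publishing by defining a maximum ratio threshold and sensitivity degrees for different sensitive attribute values, and uses clustering to generate equivalence classes to reduce information loss. Theoretical analysis and experimental results show that the method is effective and feasible.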

18.
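How to automatically identify, classify, and grade the sensitive attributes (fields) of code-obfuscated structured datasets in production environments has become a bottleneck in privacy protection for structured data. An algorithm for automatically identifying and grading sensitive attributes in structured datasets is proposed. It defines attribute sensitivity using information entropy and, through clustering on sensitivity and mining association rules between attributes, identifies the sensitive attributes of any structured dataset and quantifies their sensitivity. By analyzing the mutual-information correlation and association rules among attributes within a sensitive attribute cluster, it groups the sensitive attributes and quantifies their average sensitivity, thereby classifying and grading them. Experiments show that the algorithm can identify, classify, and grade the sensitive attributes of any structured dataset with higher efficiency and precision. Comparative analysis shows that it identifies and grades sensitive attributes simultaneously, requires no prior knowledge of attribute features or a dictionary of sensitive features, and takes both correlation and association relationships between attributes into account.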
