Similar literature
 20 similar documents found; search took 15 ms
1.
Collaborative filtering (CF) methods are widely adopted by existing recommender systems; they analyze and predict user “ratings” or “preferences” for newly generated items based on users' historical behavior. However, a privacy issue arises in this process because sensitive private user data are collected by the recommender server. Recently proposed privacy-preserving collaborative filtering (PPCF) methods, which rely on computation-intensive cryptographic techniques or data perturbation techniques, are not appropriate for real online services. In this paper, an efficient privacy-preserving item-based collaborative filtering algorithm is proposed, which protects user privacy during the online recommendation process without compromising recommendation accuracy or efficiency. The proposed method is evaluated on the Netflix Prize dataset. Experimental results demonstrate that the proposed method outperforms a randomized-perturbation-based PPCF solution and a homomorphic-encryption-based PPCF solution by over 14X and 386X, respectively, in recommendation efficiency while achieving similar or even better recommendation accuracy.

2.
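A privacy-preserving collaborative filtering recommendation algorithm based on non-negative matrix factorization is proposed. The algorithm applies random perturbation while user data are collected and then processes the data with non-negative matrix factorization, which together provide privacy protection; recommendations are generated on this basis. Theoretical analysis and experimental results show that the algorithm produces recommendations of reasonable accuracy while protecting users' personal privacy.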

3.
Geometric data perturbation for privacy preserving outsourced data mining   Cited by: 1 (self-citations: 1, citations by others: 0)
Data perturbation is a popular technique in privacy-preserving data mining. A major challenge in data perturbation is to balance privacy protection and data utility, which are normally considered a pair of conflicting factors. We argue that selectively preserving the task/model-specific information in perturbation will help achieve a better privacy guarantee and better data utility. One type of such information is the multidimensional geometric information, which is implicitly utilized by many data-mining models. To preserve this information in data perturbation, we propose the Geometric Data Perturbation (GDP) method. In this paper, we describe several aspects of the GDP method. First, we show that several types of well-known data-mining models deliver a comparable level of model quality over the geometrically perturbed data set as over the original data set. Second, we discuss the intuition behind the GDP method and compare it with other multidimensional perturbation methods such as random projection perturbation. Third, we propose a multi-column privacy evaluation framework for evaluating the effectiveness of geometric data perturbation with respect to different levels of attack. Finally, we use this evaluation framework to study a few attacks on geometrically perturbed data sets. Our experimental study also shows that geometric data perturbation can not only provide a satisfactory privacy guarantee but also preserve modeling accuracy well.
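The abstract above does not give the GDP construction, so the following is only a minimal sketch of the underlying idea that a geometric transformation can hide coordinate values while keeping the geometry that distance-based models rely on. It assumes, purely for illustration, a 2-D rotation followed by a translation; the function names `geometric_perturb` and `pairwise_distances` are hypothetical, and any noise component a real method might add is omitted.

```python
import math
import random


def geometric_perturb(points, theta, shift):
    """Rotate 2-D points by theta radians, then translate them by shift."""
    c, s = math.cos(theta), math.sin(theta)
    dx, dy = shift
    return [(c * x - s * y + dx, s * x + c * y + dy) for x, y in points]


def pairwise_distances(points):
    """Euclidean distance for every unordered pair of points."""
    return [math.dist(p, q) for i, p in enumerate(points) for q in points[i + 1:]]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
original = [(1.0, 2.0), (4.0, 6.0), (-3.0, 0.5)]
theta = rng.uniform(0.0, 2.0 * math.pi)                   # assumed: secret random angle
shift = (rng.uniform(-10.0, 10.0), rng.uniform(-10.0, 10.0))  # assumed: secret offset
perturbed = geometric_perturb(original, theta, shift)

# The coordinates change, but pairwise distances agree up to rounding error.
for before, after in zip(pairwise_distances(original), pairwise_distances(perturbed)):
    print(round(before, 6), round(after, 6))
```

Because rotation and translation are isometries, any model that depends only on Euclidean distances would see the same structure in `perturbed` as in `original`.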

4.
Random-data perturbation techniques and privacy-preserving data mining   Cited by: 2 (self-citations: 4, citations by others: 2)
Privacy is becoming an increasingly important issue in many data-mining applications. This has triggered the development of many privacy-preserving data-mining techniques. A large fraction of them use randomized data-distortion techniques to mask the data and so preserve the privacy of sensitive data. This methodology attempts to hide the sensitive data by randomly modifying the data values, often using additive noise. This paper questions the utility of the random-value distortion technique in privacy preservation. The paper first notes that random matrices have predictable structures in the spectral domain and then develops a random matrix-based spectral-filtering technique to retrieve the original data from a dataset distorted by adding random values. The proposed method works by comparing the spectrum generated from the observed data with that of random matrices. This paper presents the theoretical foundation and extensive experimental results to demonstrate that, in many cases, random-data distortion preserves very little data privacy. The analytical framework presented in this paper also points out several possible avenues for the development of new privacy-preserving data-mining techniques. Examples include algorithms that explicitly guard against privacy breaches through linear transformations and the use of multiplicative and colored noise for preserving privacy in data-mining applications.

5.
This paper explores the possibility of using multiplicative random projection matrices for privacy-preserving distributed data mining. It specifically considers the problem of computing statistical aggregates such as the inner product matrix, correlation coefficient matrix, and Euclidean distance matrix from distributed privacy-sensitive data possibly owned by multiple parties. This class of problems is directly related to many other data-mining problems such as clustering, principal component analysis, and classification. This paper makes primary contributions on two different grounds. First, it explores independent component analysis as a possible tool for breaching privacy in deterministic multiplicative perturbation-based models such as random orthogonal transformation and random rotation. Then, it proposes an approximate random projection-based technique to improve the level of privacy protection while still preserving certain statistical characteristics of the data. The paper presents extensive theoretical analysis and experimental results. Experiments demonstrate that the proposed technique is effective and can be successfully used for different types of privacy-preserving data-mining applications.
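As a rough illustration of why a random projection can preserve statistical aggregates such as inner products, the sketch below projects two vectors with the same random matrix and compares the inner product before and after. It is not the paper's protocol: the choice of i.i.d. N(0, 1) entries, the 1/sqrt(k) scaling, and the names `random_projection_matrix`, `project`, and `dot` are all assumptions made for this example.

```python
import math
import random


def random_projection_matrix(k, n, rng):
    """k x n matrix with i.i.d. N(0, 1) entries (an assumed distribution)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(k)]


def project(matrix, x):
    """Return (1/sqrt(k)) * R x, so that E[project(x) . project(y)] = x . y."""
    scale = 1.0 / math.sqrt(len(matrix))
    return [scale * sum(r * xi for r, xi in zip(row, x)) for row in matrix]


def dot(u, v):
    """Plain inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


rng = random.Random(42)
x = [1, 0, 2, 0, 1, 0, 2, 0, 1, 0]
y = [0, 1, 2, 1, 0, 1, 2, 1, 0, 1]

# A single projection from 10 to 5 dimensions gives a noisy estimate of x . y.
R = random_projection_matrix(5, len(x), rng)
print("true:", dot(x, y), "estimate:", dot(project(R, x), project(R, y)))
```

A single low-dimensional projection is a noisy estimator; the estimate is unbiased, and its variance shrinks as the projected dimension k grows.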

6.
There are two main approaches to privacy-preserving data mining: the perturbation approach and the cryptographic approach. The perturbation approach is typically very efficient, but it suffers from a tradeoff between accuracy and privacy. In contrast, the cryptographic approach usually maintains accuracy, but it is more expensive in computation and communication overhead. We propose a novel perturbation method, called guided perturbation. Specifically, we focus on a central problem of privacy-preserving data mining, the secure scalar product problem for vertically partitioned data, and give a solution based on guided perturbation with a good, provable privacy guarantee. Our solution achieves accuracy comparable to the cryptographic solutions while keeping the efficiency of perturbation solutions. Our experimental results show that it can be more than one hundred times faster than a typical cryptographic solution.

7.
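Boolean rule mining based on multi-parameter random perturbation   Cited by: 2 (self-citations: 0, citations by others: 2)
陈芸, 张伟, 周霆, 邹汉斌. Computer Engineering (计算机工程), 2006, 32(10): 63-65
Building on the MASK algorithm, this paper proposes a procedure for mining Boolean rules after multi-parameter random perturbation. Evaluation of the experimental results shows that the algorithm provides a high degree of privacy protection, and the relationship between privacy protection and mining accuracy is discussed. Finally, directions for future research on data mining under multi-parameter random perturbation are outlined.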

8.
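To address the insufficient protection of users' electricity usage patterns in current smart meter privacy-preserving methods, this paper proposes using time-delay perturbation to disrupt the data waveform. A time-perturbation-based smart meter privacy protection model is derived subject to preserving the usability of smart meter data; it trades off data security against usability by perturbing the release time of smart meter data, and a non-intrusive load monitoring algorithm is used to test privacy security. Experimental results show that the time-perturbation-based method effectively lowers the recognition accuracy of appliance switching events and does so more effectively than random perturbation and rechargeable-battery methods; the aggregation error across multiple users stays at about 10%, and the method also performs well in terms of billing error.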

9.
There has been relatively little work on privacy-preserving techniques for distance-based mining. The most widely used ones are additive perturbation methods and orthogonal-transform-based methods. These methods concentrate on privacy protection in the average case and provide no worst-case privacy guarantee. The lack of such a guarantee makes these techniques difficult to use in practice and leaves them open to privacy breaches under certain attacks. This paper proposes a novel privacy protection method for distance-based mining algorithms that gives worst-case privacy guarantees and protects the data against correlation-based and transform-based attacks. The method has three novel aspects. First, it uses a framework that provides a theoretical bound on privacy breach in the worst case. This framework provides easy-to-check conditions for determining whether a method provides a worst-case guarantee. A quick examination shows that special types of noise such as Laplace noise provide a worst-case guarantee, while most existing methods, such as adding normal or uniform noise or using random projection, do not. Second, the proposed method combines the favorable features of additive perturbation and orthogonal transform methods. It uses principal component analysis (PCA) to decorrelate the data and thus guards against attacks based on data correlations. It then adds Laplace noise to guard against attacks that can recover the PCA transform. Third, the proposed method improves the accuracy of one of the popular distance-based classification algorithms, K-nearest neighbor classification, by taking into account the degree of distance distortion introduced by sanitization. Extensive experiments demonstrate the effectiveness of the proposed method.
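The Laplace-noise step mentioned in the abstract above can be sketched with the standard library alone. This is only an illustration of additive Laplace noise, not the paper's full method: the PCA decorrelation step is omitted, the noise scale is an arbitrary assumed value, and the names `laplace_noise` and `sanitize` are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample.

    The difference of two independent exponential variables with mean
    `scale` follows a Laplace distribution centered at 0 with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def sanitize(record, scale, rng):
    """Add independent Laplace noise to every attribute of one record."""
    return [value + laplace_noise(scale, rng) for value in record]


rng = random.Random(2024)
record = [5.1, 3.5, 1.4]          # made-up attribute values
print(sanitize(record, scale=0.5, rng=rng))
```

In a fuller pipeline the records would presumably be decorrelated first, with the noise scale chosen from the desired privacy bound; neither choice is specified in the abstract.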

10.
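李光, 王亚东, 苏小红. Computer Engineering (计算机工程), 2012, 38(3): 12-13, 18
In existing privacy-preserving classification mining algorithms based on data perturbation, the perturbed data are correlated with the original data, so protection of private data is incomplete; moreover, the perturbation algorithm and the classification algorithm are tightly coupled, which makes these methods unsuitable for practical use. To address this, a privacy-preserving classification mining algorithm based on probability theory is proposed. Perturbation yields a data set that is independent of, and identically distributed with, the original data, so the perturbed and original data are no longer correlated and a wide range of classification algorithms can be applied directly to the perturbed data.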

11.
Collaborative recommender systems offer a solution to the information overload problem found in online environments such as e-commerce. The use of collaborative filtering, the most widely used recommendation method, gives rise to potential privacy issues. In addition, the user ratings that collaborative filtering systems use to recommend products or services must be protected. The purpose of this research is to provide a solution to the privacy concerns of collaborative filtering users while maintaining high recommendation accuracy. This paper proposes a multi-level privacy-preserving method for collaborative filtering systems that perturbs each rating before it is submitted to the server. The perturbation method is based on multiple levels, with a different range of random values for each level. Before each rating is submitted, the privacy level and the perturbation range are selected randomly from a fixed range of privacy levels. The proposed privacy method has been evaluated experimentally; the results show that user privacy can be protected with a small decrease in utility and that the approach is practical and effective.
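The client-side step described above (pick a privacy level at random, then perturb the rating within that level's range before submission) might look roughly like the following. The abstract gives no concrete levels, ranges, or noise distribution, so the three levels, their half-widths, the uniform noise, and the name `perturb_rating` are all hypothetical choices for this sketch.

```python
import random

# Hypothetical privacy levels: each maps to the half-width of its noise range.
LEVEL_RANGES = {1: 0.5, 2: 1.0, 3: 2.0}


def perturb_rating(rating, rng, level_ranges=LEVEL_RANGES):
    """Pick a privacy level uniformly at random and add uniform noise
    drawn from that level's range. Returns (perturbed_rating, level)."""
    level = rng.choice(sorted(level_ranges))
    half_width = level_ranges[level]
    return rating + rng.uniform(-half_width, half_width), level


rng = random.Random(5)
for true_rating in [1, 3, 5]:
    print(true_rating, perturb_rating(true_rating, rng))
```

Only the perturbed value would be sent to the server; whether the chosen level is also disclosed is a design decision the abstract does not settle.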

12.
Current information visualization techniques assume unrestricted access to data. However, privacy protection is a key issue for many real-world data analyses. Corporate data, medical records, etc. are rich in analytical value but cannot be shared without first going through a transformation step in which explicit identifiers are removed and the data is sanitized. Researchers in the field of data mining have proposed different techniques over the years for privacy-preserving data publishing and for subsequent mining of such sanitized data. A well-known drawback of these methods is that even a small guarantee of privacy greatly reduces the utility of the datasets. In this paper, we propose an adaptive technique for privacy preservation in parallel coordinates. Based on knowledge about the sensitivity of the data, we compute a clustered representation on the fly, which allows the user to explore the data without breaching privacy. Through the use of screen-space privacy metrics, the technique adapts to the user's screen parameters and interaction. We demonstrate our method in a case study and discuss potential attack scenarios.

13.
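A survey of privacy preservation research for database applications   Cited by: 39 (self-citations: 3, citations by others: 36)
With the emergence and development of database applications such as data mining and data publishing, protecting private data and preventing the disclosure of sensitive information has become a major challenge. Privacy-preserving techniques must protect data privacy without impairing data applications. Depending on the technique used, privacy-preserving approaches fall into data distortion, data encryption, and restricted publishing. This paper summarizes existing research in the field of privacy preservation, explains the basic principles and characteristics of each class of technique, and describes their typical applications in detail, with emphasis on a current research focus in the field: privacy preservation based on data anonymization. Based on an in-depth comparative analysis of existing techniques, it points out future directions for privacy-preserving technology.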

14.
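Because a public cloud is not a trusted entity, it may steal sensitive information from image data when image retrieval services are provided through it. In recent years, ciphertext image retrieval methods have been proposed to protect image privacy. However, traditional privacy-preserving image retrieval schemes have low search efficiency and cannot support multi-user scenarios. This paper therefore proposes a secure and efficient access-control-based scheme for multi-user outsourced image retrieval. The scheme uses one-time pads and matrix transformation to achieve ciphertext image retrieval based on Euclidean-distance similarity, and uses matrix decomposition and proxy re-encryption to support multi-user outsourced image retrieval. A locality-sensitive hashing algorithm is used to build the index, improving the efficiency of ciphertext image retrieval. In particular, a lightweight access control strategy based on role polynomial functions is proposed; it allows image access permissions to be set flexibly and prevents malicious users from stealing private information. Security analysis demonstrates that the proposed scheme protects the confidentiality of images and query requests, and experimental results show that it achieves efficient image retrieval.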

15.
Traditionally, many data mining techniques have been designed in the centralized model, in which all data is collected and available at one central site. However, as more and more activities are carried out using computers and computer networks, the amount of potentially sensitive data stored by businesses, governments, and other parties increases. Different parties often wish to benefit from cooperative use of their data, but privacy regulations and other privacy concerns may prevent the parties from sharing their data. Privacy-preserving data mining provides a solution by creating distributed data mining algorithms in which the underlying data need not be revealed. In this paper, we present privacy-preserving protocols for a particular data mining task: learning a Bayesian network from a database vertically partitioned between two parties. In this setting, two parties owning confidential databases wish to learn the Bayesian network on the combination of their databases without revealing anything else about their data to each other. We present an efficient and privacy-preserving protocol to construct a Bayesian network on the parties' joint data.

16.
With the proliferation of healthcare data, cloud mining technology for e-health services and applications has become a hot research topic. On the other hand, these rapidly evolving cloud mining technologies and their deployment in healthcare systems also pose potential threats to patients' data privacy. To solve the privacy problem in cloud mining, this paper proposes a semi-supervised privacy-preserving clustering algorithm. Using a small amount of supervised information, the method first learns a Large Margin Nearest Cluster metric by convex optimization. Then, according to the trained metric, the method imposes a multiplicative perturbation on the original data, which changes the distribution shape of the original data and thus protects private information while ensuring high data usability. The experimental results on the brain fiber dataset provided by the 2009 PBC demonstrate that the proposed method not only protects data privacy against security attacks but also improves clustering purity.

17.
Yao Liu, Hui Xiong. Information Sciences, 2006, 176(9): 1215-1240
A data warehouse stores current and historical records consolidated from multiple transactional systems. Securing data warehouses is of ever-increasing interest, especially in areas where data are sold in pieces to third parties for data mining. In this case, existing data warehouse security techniques, such as data access control, may not be easy to enforce and can be ineffective. Instead, this paper proposes a data-perturbation-based approach, called the cubic-wise balance method, to provide privacy-preserving range queries on data cubes in a data warehouse. This approach is motivated by the following observation: analysts are usually interested in summary data rather than individual data values. Indeed, our approach can provide closely estimated summary data for range queries without providing access to actual individual data values. As demonstrated by our experimental results on the APB benchmark data set from the OLAP Council, the cubic-wise balance method can achieve both better privacy preservation and better range query accuracy than random data perturbation alternatives.

18.
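To provide a better user experience, a differentially private random-perturbation recommendation algorithm that incorporates compressive dimensionality reduction into matrix factorization is proposed. First, the matrix factorization algorithm under local differential privacy (LDP) is improved: each user randomizes their own data to meet different privacy requirements and sends the perturbed data to the recommender. The recommender then computes aggregates of the perturbed data, and the framework ensures that both a user's items and ratings remain private from the recommender. To handle the high dimensionality of the data that arises when LDP is applied to matrix factorization, random-projection dimensionality reduction is used to reduce the dimension of user data without prior knowledge of the data. Experiments on the Last.fm and Flixster test sets analyze recommendation accuracy, recommendation efficiency, and the effect of parameter changes; they show that, under stronger privacy requirements, the algorithm achieves better matrix factorization performance than existing algorithms, verifying its effectiveness.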

19.
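Randomized-response-based privacy-preserving classification mining algorithms apply only when the attribute values of the original data are binary. To address this limitation, a privacy-preserving classification mining algorithm suitable for original data with multi-valued attributes is designed. The algorithm has two parts: (a) by comparing a preset parameter value with a randomly generated number, it decides whether to change the order of the original data, thereby transforming the original data and protecting data privacy; (b) by computing probability estimates of the information gain ratio, it builds a decision tree on the disguised data.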
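For context, the binary randomized-response setting that this abstract starts from can be sketched as follows. This shows the classic binary scheme (report the true bit with probability p, otherwise its complement) and the standard unbiased estimator of the true proportion; it is not the multi-valued algorithm the paper proposes. The names `randomize` and `estimate_proportion` and the value p = 0.8 are assumptions for this example.

```python
import random


def randomize(bit, p, rng):
    """Report the true bit with probability p, otherwise its complement."""
    return bit if rng.random() < p else 1 - bit


def estimate_proportion(reports, p):
    """Estimate the true fraction of 1s from the randomized reports.

    If pi is the true fraction, the expected observed fraction is
    p * pi + (1 - p) * (1 - pi); solving for pi gives the formula below.
    Requires p != 0.5.
    """
    observed = sum(reports) / len(reports)
    return (observed - (1 - p)) / (2 * p - 1)


rng = random.Random(11)
truth = [1] * 6000 + [0] * 14000          # true proportion of 1s is 0.3
reports = [randomize(b, 0.8, rng) for b in truth]
print("estimated proportion:", estimate_proportion(reports, 0.8))
```

The server never learns any individual's true bit with certainty, yet aggregate statistics, such as the probability estimates a decision-tree learner needs, remain recoverable.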

20.
Data collection is a necessary step in the data mining process. For privacy reasons, collecting data from different parties is becoming difficult. Privacy concerns may prevent the parties from directly sharing the data and some types of information about the data. How multiple parties can collaboratively conduct data mining without breaching data privacy presents a challenge. The objective of this paper is to provide solutions for privacy-preserving collaborative data mining problems. In particular, we illustrate how to conduct privacy-preserving naive Bayesian classification, which is one such data mining task. To measure the privacy level of privacy-preserving schemes, we propose a definition of privacy and show that our solutions preserve data privacy.
