Similar literature
19 similar documents found
1.
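With the development of data mining technology, the protection of personal privacy in data mining has drawn increasing attention. How to mine useful information while protecting privacy has become one of the research trends in data mining in recent years. To protect personal private information, we first randomize the data and then analyze and mine the randomized data. This paper introduces the reasons behind the development of privacy protection, randomization methods, and other algorithms for privacy-preserving data mining.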

2.
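Randomization methods for privacy protection in data mining   Total citations: 6, self-citations: 0, citations by others: 6
The main task in data mining is building models over aggregated data. The protection of personal privacy in data mining is currently receiving growing attention and study. To protect personal privacy, we first randomize certain private data and then build models on that basis. This paper introduces the development of the privacy protection topic, the general algorithm of the randomization method, and the prospects for privacy protection technology.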
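The idea of randomizing private values and then modeling only the aggregate can be illustrated with a small sketch. It assumes additive uniform noise with a publicly known half-width, which is one common form of randomization and not necessarily the algorithm the paper describes; the names `randomize`, `estimate_mean_and_variance`, and the sample data are hypothetical.

```python
import random
import statistics

def randomize(values, half_width, rng):
    # Each owner adds independent Uniform(-half_width, half_width) noise before release.
    return [v + rng.uniform(-half_width, half_width) for v in values]

def estimate_mean_and_variance(randomized, half_width):
    # The noise has zero mean and variance half_width**2 / 3, so the miner can
    # correct the aggregate statistics without seeing any original value.
    noise_variance = half_width ** 2 / 3.0
    mean = statistics.fmean(randomized)
    variance = statistics.pvariance(randomized) - noise_variance
    return mean, variance

rng = random.Random(7)
# Hypothetical private values (for illustration only).
salaries = [rng.gauss(50.0, 10.0) for _ in range(20000)]
released = randomize(salaries, half_width=30.0, rng=rng)
est_mean, est_var = estimate_mean_and_variance(released, half_width=30.0)
```

Individual released values can be far from the originals, while the corrected mean and variance stay close to those of the original data when the sample is large.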

3.
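Differential privacy and its applications   Total citations: 3, self-citations: 0, citations by others: 3
Privacy protection in data publishing and data mining is currently a research hotspot in information security. As a rigorous and provable definition of privacy, differential privacy has attracted great attention and been widely studied in recent years. This paper analyzes the advantages of the differential privacy model over traditional security models and surveys the basic theory of differential privacy and research on its application to data publishing and data mining. For data publishing, it introduces various interactive and non-interactive differentially private publishing methods and compares them with emphasis on accuracy and sample complexity. For data mining, it describes how differentially private data mining algorithms are implemented in interface mode and in fully accessible mode and analyzes their performance. Finally, it introduces applications of differential privacy in other fields and looks ahead to future research directions.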
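As a concrete instance of an interactive differentially private release, the sketch below applies the standard Laplace mechanism to a counting query. It is a minimal illustration and is not taken from the survey; the function names and the sample `ages` data are hypothetical.

```python
import random

def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_count(records, predicate, epsilon, rng):
    # A counting query has sensitivity 1: adding or removing one record
    # changes the true count by at most 1, so the noise scale is 1 / epsilon.
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)

rng = random.Random(0)
ages = [23, 35, 41, 29, 62, 57, 33, 48]  # hypothetical records
noisy = private_count(ages, lambda a: a >= 40, epsilon=0.5, rng=rng)
```

A smaller `epsilon` gives stronger privacy and a noisier answer, which is the accuracy trade-off the survey compares across methods.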

4.
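This paper first introduces the background and significance of privacy-preserving data mining methods, then summarizes the current state of research on privacy-preserving data mining algorithms in China and abroad. It classifies the algorithms proposed so far by data mining method, distribution of data sources, privacy protection technique, object of privacy protection, and type of data mining application. It then describes in detail some typical techniques and algorithms for privacy-preserving association rule mining, classification, and clustering in centralized and distributed data settings, summarizes their strengths and weaknesses, and analyzes and compares them. Finally, it points out the overall future direction of privacy-preserving data mining algorithms.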

5.
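A survey of privacy preservation research for database applications   Total citations: 36, self-citations: 3, citations by others: 36
With the emergence and development of database applications such as data mining and data publishing, protecting private data and preventing the disclosure of sensitive information has become a major challenge. Privacy protection techniques need to protect data privacy without impairing the applications that use the data. Depending on the technique adopted, approaches such as data distortion, data encryption, and restricted publishing have emerged. This paper summarizes existing research in the field of privacy protection, explains the basic principles and characteristics of each class of technique, describes their typical applications in detail, and focuses on a current research hotspot in the field: privacy protection based on data anonymization. Based on an in-depth comparative analysis of existing techniques, it points out future directions for privacy protection technology.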

6.
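Because of the many advantages of cloud computing, users tend to outsource tasks such as data mining and data analysis to specialized cloud service providers, but user privacy then cannot be guaranteed. Many researchers currently focus on privacy protection for the storage of sensitive data in the cloud, while research on privacy-preserving data analysis remains comparatively scarce. Yet if big data is not mined and analyzed merely for the sake of protecting privacy, it loses its enormous potential value. This paper proposes a lattice-based privacy-preserving data publishing method for cloud computing environments. It uses lattice encryption to construct secure homomorphic operations over private data and, on that basis, implements cloud-side clustering services over ciphertext data that support privacy protection. To protect their data privacy, users encrypt their data before publishing it to the cloud service provider. The provider uses the lattice-based homomorphic encryption algorithm to offer privacy-preserving k-means, privacy-preserving hierarchical clustering, and privacy-preserving DBSCAN data mining services, but cannot directly access user data and so cannot violate user privacy. Compared with existing private data publishing methods, the proposed publishing method is based on the lattice closest vector problem (CVP) and shortest vector problem (SVP) and offers a high level of security. The algorithm also preserves the exactness of distances between ciphertext data, so the mining results are more accurate and usable than those of existing work. The paper gives a theoretical security analysis and designs experiments to evaluate the efficiency of the proposed method. The experimental results show that the proposed lattice-based privacy-preserving data mining algorithm achieves higher data analysis accuracy and higher computational efficiency than existing methods.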

7.
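Data privacy preservation techniques in wireless sensor networks   Total citations: 13, self-citations: 0, citations by others: 13
Fan Yongjian (范永健), Chen Hong (陈红), Zhang Xiaoying (张晓莹). Chinese Journal of Computers (《计算机学报》), 2012, 35(6): 1131-1146
Studying and solving the data privacy protection problem is important for the large-scale deployment of wireless sensor networks, while the characteristics of these networks pose serious challenges to data privacy protection techniques. Data privacy protection in wireless sensor networks has become a research hotspot, with work focusing mainly on privacy protection in data aggregation, data querying, and access control. This paper summarizes existing research on data privacy protection in wireless sensor networks and classifies it along two dimensions: the data operation task and the technique used to implement privacy protection. It introduces the network models, attack models, and security goals, explains the key implementation techniques of representative protocols, analyzes and compares the performance of these protocols, summarizes their main strengths and weaknesses, and finally points out future research directions.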

8.
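With the rapid development of computer network and wireless communication technologies, the construction and development of database-related applications, including data mining and data publishing, has become one of the issues of greatest concern to practitioners. Under current technical conditions, how to systematically protect the private data involved in database systems and prevent sensitive information in them from leaking, with database applications at the center, is a key question for practitioners to study. Techniques that can currently deliver database privacy protection fall into three broad categories: 1. data distortion; 2. data encryption; 3. restricted protection. The main advantage of these three approaches is that they provide privacy protection while keeping the database system as a whole running normally and stably. On this basis, this paper takes privacy protection for database applications as its subject and analyzes it in some detail from three angles: research directions and current status of privacy protection, the classification of privacy protection techniques, and privacy protection techniques for database applications. It thereby argues that sound database privacy protection plays a vital role in further improving the quality and efficiency of database applications.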

9.
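With the continued growth of the information society, e-commerce, and e-government, data has become an important social resource and data mining is being applied ever more deeply. At the same time, privacy protection has become one of the hot topics in data mining research. This paper introduces the current state of privacy protection in data mining, describes the relevant concepts, characteristics, classifications, and research results, presents the related techniques from two angles, data perturbation and secure multi-party computation, and proposes future research directions.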

10.
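The continued development of information technology and healthcare informatization has produced medical data at large scale, creating the conditions for deeper applications such as data analysis, data mining, and intelligent diagnosis. Medical datasets are huge and contain a great deal of patient private information, so using medical data while protecting patient privacy is highly challenging. The privacy protection techniques currently applied in the medical field are mainly anonymization techniques, but when an attacker has strong background knowledge, such methods cannot balance the privacy and the utility of the dataset...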

11.
Imagine numerous clients, each with personal data; individual inputs are severely corrupt, and a server only concerns the collective, statistically essential facets of this data. In several data mining methods, privacy has become highly critical. As a result, various privacy-preserving data analysis technologies have emerged. Hence, we use the randomization process to reconstruct composite data attributes accurately. Also, we use privacy measures to estimate how much deception is required to guarantee privacy. There are several viable privacy protections; however, determining which one is the best is still a work in progress. This paper discusses the difficulty of measuring privacy while also offering numerous random sampling procedures and statistical and categorized data results. Furthermore, this paper investigates the use of arbitrary nature with perturbations in privacy preservation. According to the research, arbitrary objects (most notably random matrices) have "predicted" frequency patterns. It shows how to recover crucial information from a sample damaged by a random number using an arbitrary lattice spectral selection strategy. This filtration system's conceptual framework posits, and extensive practical findings indicate that sparse data distortions preserve relatively modest privacy protection in various situations. As a result, the research framework is efficient and effective in maintaining data privacy and security.

12.
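Li Guang (李光), Wang Yadong (王亚东), Su Xiaohong (苏小红). Computer Engineering (《计算机工程》), 2012, 38(3): 12-13, 18
In existing privacy-preserving classification mining algorithms based on data perturbation, the perturbed data is correlated with the original data, so the protection of private data is incomplete, and the perturbation algorithm is tightly coupled to the classification algorithm, which makes these methods unsuitable for practical use. To address this, a probability-theory-based privacy-preserving classification mining algorithm is proposed. After perturbation, it yields a set of data that is independent of and identically distributed with the original data, so the perturbed data is no longer correlated with the original data and any classification algorithm can be applied directly to the perturbed data.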

13.
Data collection is a necessary step in the data mining process. Due to privacy reasons, collecting data from different parties becomes difficult. Privacy concerns may prevent the parties from directly sharing the data and some types of information about the data. How multiple parties can collaboratively conduct data mining without breaching data privacy presents a challenge. The objective of this paper is to provide solutions for privacy-preserving collaborative data mining problems. In particular, we illustrate how to conduct privacy-preserving naive Bayesian classification, which is one of the data mining tasks. To measure the privacy level of privacy-preserving schemes, we propose a definition of privacy and show that our solutions preserve data privacy.

14.
Over the last decade, privacy has been widely recognised as one of the major problems of data collections in general and the Web in particular. This concerns specifically data arising from Web usage (such as querying or transacting) and social networking (characterised by rich self-profiling including relational information) and the inferences drawn from them. The data mining community has been very conscious of these issues and has addressed in particular the inference problems through various methods for “privacy-preserving data mining” and “privacy-preserving data publishing”. However, it appears that these approaches by themselves cannot effectively solve the privacy problems posed by mining. We argue that this is due to the underlying notions of privacy and of data mining, both of which are too narrow. Drawing on notions of privacy not only as hiding, but as control and negotiation, as well as on data mining not only as modelling, but as the whole cycle of knowledge discovery, we offer an alternative view. This is intended to be a comprehensive view of the privacy challenges as well as solution approaches along all phases of the knowledge discovery cycle. The paper thus combines a survey with an outline of an agenda for a comprehensive, interdisciplinary view of Web mining and privacy.

15.
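Privacy-preserving classification mining algorithms based on randomized response apply only when the attribute values of the original data are binary. To address this, a privacy-preserving classification mining algorithm suited to original data with multi-valued attributes is designed. The algorithm has two parts: a) it compares a preset parameter value with a randomly generated number to decide whether to change the order of the original data, thereby transforming the original data and protecting data privacy; b) it builds a decision tree on the disguised data by computing probability estimates of the information gain ratio.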
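A rough sketch of randomized response for a multi-valued attribute follows. It uses a generic scheme (keep the true value with probability `theta`, otherwise report a different value chosen uniformly), which is an assumption made for illustration and not the paper's exact transformation; `perturb`, `estimate_distribution`, and the sample data are hypothetical names and values.

```python
import random
from collections import Counter

def perturb(value, domain, theta, rng):
    # Compare a random draw with the preset parameter theta:
    # below theta the true value is kept, otherwise another value is reported.
    if rng.random() < theta:
        return value
    others = [v for v in domain if v != value]
    return rng.choice(others)

def estimate_distribution(reported, domain, theta):
    # Observed frequency of v is theta * p_v + q * (1 - p_v),
    # where q is the chance of reporting one specific wrong value.
    # Solving for p_v gives the estimate below (requires theta != 1 / k).
    k = len(domain)
    q = (1.0 - theta) / (k - 1)
    n = len(reported)
    counts = Counter(reported)
    return {v: (counts[v] / n - q) / (theta - q) for v in domain}

rng = random.Random(1)
domain = ["low", "medium", "high"]
true_values = ["low"] * 5000 + ["medium"] * 3000 + ["high"] * 2000
reported = [perturb(v, domain, 0.7, rng) for v in true_values]
estimate = estimate_distribution(reported, domain, 0.7)
```

Estimated proportions of this kind are the sort of probability estimates that could feed an information gain ratio computation when building a decision tree on disguised data.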

16.
This paper proposes a scalable, local privacy-preserving algorithm for distributed Peer-to-Peer (P2P) data aggregation useful for many advanced data mining/analysis tasks such as average/sum computation, decision tree induction, feature selection, and more. Unlike most multi-party privacy-preserving data mining algorithms, this approach works in an asynchronous manner through local interactions and it is highly scalable. It particularly deals with the distributed computation of the sum of a set of numbers stored at different peers in a P2P network in the context of a P2P web mining application. The proposed optimization-based privacy-preserving technique for computing the sum allows different peers to specify different privacy requirements without having to adhere to a global set of parameters for the chosen privacy model. Since distributed sum computation is a frequently used primitive, the proposed approach is likely to have significant impact on many data mining tasks such as multi-party privacy-preserving clustering, frequent itemset mining, and statistical aggregate computation.
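To make the sum primitive concrete, here is a minimal sketch of a classic secure-sum construction based on additive secret sharing. It is a simpler, synchronous technique swapped in for illustration and is not the optimization-based asynchronous algorithm the paper proposes; `MODULUS`, `split_into_shares`, and `secure_sum` are hypothetical names.

```python
import random

MODULUS = 2**61 - 1  # values and their sum are assumed to be non-negative and below this

def split_into_shares(value, n_peers, rng):
    # n_peers - 1 random shares plus one share that makes the total equal value (mod MODULUS).
    shares = [rng.randrange(MODULUS) for _ in range(n_peers - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares

def secure_sum(private_values, rng):
    n = len(private_values)
    # shares[i][j] is the share that peer i sends to peer j.
    shares = [split_into_shares(v, n, rng) for v in private_values]
    # Each peer j publishes only the sum of the shares it received.
    partial_sums = [sum(shares[i][j] for i in range(n)) % MODULUS for j in range(n)]
    return sum(partial_sums) % MODULUS

# random.Random is used for reproducibility only; it is not cryptographically secure.
rng = random.Random(42)
values = [12, 7, 30, 5]
total = secure_sum(values, rng)  # equals 54
```

No single share or partial sum reveals a peer's value on its own, but unlike the paper's approach this sketch applies one global scheme to all peers and does not let peers set individual privacy requirements.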

17.
With the proliferation of healthcare data, cloud mining technology for E-health services and applications has become a hot research topic. On the other hand, these rapidly evolving cloud mining technologies and their deployment in healthcare systems also pose potential threats to patients' data privacy. In order to solve the privacy problem in cloud mining, this paper proposes a semi-supervised privacy-preserving clustering algorithm. By employing a small amount of supervised information, the method first learns a Large Margin Nearest Cluster metric using convex optimization. Then, according to the trained metric, the method imposes multiplicative perturbation on the original data, which can change the distribution shape of the original data and thus protect the private information while ensuring high data usability. The experimental results on the brain fiber dataset provided by the 2009 PBC demonstrated that the proposed method could not only protect data privacy against attacks, but also improve the clustering purity.

18.
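Existing privacy-preserving data mining methods based on non-negative matrix factorization do not distinguish samples by importance and perturb all samples with the same strength. To improve on this, a privacy-preserving classification method based on non-negative matrix factorization combined with sample selection is proposed. The method uses sample selection to divide the original samples into important and unimportant ones. When perturbing the data, it first applies the existing NMF-based method to all samples and then uses the clustering property of non-negative matrix factorization to apply additional perturbation to the unimportant samples. Experiments show that the method provides better protection of private information while maintaining data utility.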

19.
The increasing availability of personal data of a sequential nature, such as time-stamped transaction or location data, enables increasingly sophisticated sequential pattern mining techniques. However, privacy is at risk if it is possible to reconstruct the identity of individuals from sequential data. Therefore, it is important to develop privacy-preserving techniques that support publishing of really anonymous data, without altering the analysis results significantly. In this paper we propose to apply the Privacy-by-design paradigm for designing a technological framework to counter the threats of undesirable, unlawful effects of privacy violation on sequence data, without obstructing the knowledge discovery opportunities of data mining technologies. First, we introduce a k-anonymity framework for sequence data, by defining the sequence linking attack model and its associated countermeasure, a k-anonymity notion for sequence datasets, which provides a formal protection against the attack. Second, we instantiate this framework and provide a specific method for constructing the k-anonymous version of a sequence dataset, which preserves the results of sequential pattern mining, together with several basic statistics and other analytical properties of the original data, including the clustering structure. A comprehensive experimental study on realistic datasets of process-logs, web-logs and GPS tracks is carried out, which empirically shows how, in our proposed method, the protection of privacy meets analytical utility.
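The abstract does not spell out the k-anonymity notion, so the sketch below assumes one plausible reading: every subsequence (up to a chosen length) that occurs in the dataset should be supported by at least k sequences, since a rarer one could help an attacker link a known subsequence to an individual. The function names, the `max_len` bound, and the toy dataset are hypothetical.

```python
from itertools import combinations

def subsequences(seq, max_len):
    # All distinct (not necessarily contiguous) subsequences of length 1..max_len.
    found = set()
    for length in range(1, max_len + 1):
        for idx in combinations(range(len(seq)), length):
            found.add(tuple(seq[i] for i in idx))
    return found

def rare_subsequences(dataset, k, max_len):
    # Support = number of sequences in the dataset that contain the subsequence.
    support = {}
    for seq in dataset:
        for sub in subsequences(seq, max_len):
            support[sub] = support.get(sub, 0) + 1
    return {sub for sub, count in support.items() if count < k}

dataset = [
    ["home", "work", "gym"],
    ["home", "work", "shop"],
    ["home", "work", "gym"],
]
rare = rare_subsequences(dataset, k=2, max_len=2)
```

Under this assumed definition the dataset counts as k-anonymous when `rare` is empty; here the subsequences involving "shop" occur in only one sequence and would need to be suppressed or generalized.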
