Similar literature
 20 similar documents found (search time: 0 ms)
1.
The heterogeneous Euclidean-overlap metric and the heterogeneous value difference metric from the machine learning literature are useful for handling mixed-type data in machine learning, pattern recognition and data mining tasks. Mixed-type variables are quite common in practical problems, but this property has seldom been taken into account in pattern recognition, data mining and decision making algorithms. We observed that these two distance measures are not actually metrics: we found a special situation in which they are pseudometrics rather than metrics, a feature to be noted when using them. Nevertheless, by changing their definitions somewhat, it is possible to make them metrics. Especially in medical applications, the redefinition of the two measures might be important, since otherwise it is possible in theory that, for example, two identical cases would be classified differently. Nearest neighbor searching tests with medical data were run to illustrate the behavior of these measures. Despite violating metricity, the original forms yielded slightly better classification results. The reason was that the real data sets tested contained very few nearly similar cases according to these distance measures, and the original forms, which are based on more separating distances than the redefinitions, were slightly better in classification.
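As an illustration of the first measure named in the abstract above, here is a minimal sketch of the heterogeneous Euclidean-overlap metric as it is commonly defined: overlap distance for nominal attributes, range-normalized difference for numeric attributes, and distance 1 whenever a value is missing. The function name `heom`, the use of `None` for missing values, and the sample records are assumptions made for this example; the abstract does not give the authors' redefinition, so none is shown.

```python
import math


def heom(x, y, is_nominal, ranges):
    """HEOM distance between two records; None marks a missing value."""
    total = 0.0
    for a, (xa, ya) in enumerate(zip(x, y)):
        if xa is None or ya is None:
            d = 1.0                       # missing value: maximal attribute distance
        elif is_nominal[a]:
            d = 0.0 if xa == ya else 1.0  # overlap distance for nominal attributes
        else:
            d = abs(xa - ya) / ranges[a]  # numeric difference normalized by the range
        total += d * d
    return math.sqrt(total)


# Hypothetical records: (sex, age), with an assumed age range of 50 years.
is_nominal = [True, False]
ranges = [None, 50.0]
a = ("male", 30.0)
b = ("female", 55.0)
c = ("male", None)  # age is missing

print(heom(a, a, is_nominal, ranges))  # identical complete records
print(heom(a, b, is_nominal, ranges))  # records differing in both attributes
print(heom(c, c, is_nominal, ranges))  # identical records with a missing value
```

Under this convention a record with a missing value lies at a nonzero distance from an identical copy of itself. Whether this is the special situation the authors report is not stated in the abstract.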

2.
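This study addresses privacy protection for sequential patterns and proposes an effective sequential pattern hiding algorithm, SDRF, so that core information stays protected when sequential patterns are shared.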

3.
Privacy preserving data mining algorithms are proposed to protect the participating parties’ data privacy in data mining processes. So far, most of these algorithms work only in the semi-honest model, which assumes that all parties follow the algorithms honestly. In this paper, we propose two privacy preserving perceptron learning algorithms in the malicious model, for horizontally and vertically partitioned data sets, respectively. To our knowledge, our algorithms are the first perceptron learning algorithms that can protect data privacy in the malicious model.

4.
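Research on secure mining algorithms for association rules in distributed databases   Total citations: 1, self-citations: 0, citations by others: 1
In a distributed environment, mining association rules over distributed databases without disclosing users' private information is an important problem. This paper proposes PPDMA (Privacy Preserving Distributed Mining Algorithms), a secure mining algorithm for association rules in distributed databases. Cryptographic methods are used to encrypt the constrained subtrees and other information that sites exchange in order to mine global frequent itemsets, and the receiving site decrypts the encrypted information. User information is therefore not disclosed and user privacy is protected while association rules are mined securely. Analysis shows that the algorithm is correct and feasible.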

5.
6.

Cloud computing and efficient storage provide new paradigms and approaches aimed at the efficient utilization of resources through computation, along with many alternatives for guaranteeing the privacy of individual users. They also ensure the integrity of stored cloud data and the processing of stored data in the various data centers. However, providing better protection and management of sensitive information (data) is a big challenge for maintaining the confidentiality and integrity of data in cloud computation. Thus, there is an urgent need to store and process data in the cloud environment without any information leakage. Sensitive data require storing and processing mechanisms and techniques that assure the privacy of individual users, maintain data integrity, and preserve confidentiality. Face recognition has recently achieved advances in the unobtrusive recognition of individuals, which helps maintain privacy in cloud computing. This paper focuses on cloud security and privacy issues and provides a solution using biometric face recognition. We propose a biometric face recognition approach for the security and privacy preservation of cloud users during their access to cloud resources. The proposed approach has three steps: (1) acquisition of face images, (2) preprocessing and extraction of facial features, and (3) recognition of the individual using encrypted biometric features. The experimental results establish that our proposed recognition approach can ensure the privacy and security of biometric data.


7.
8.
In recent years, classification learning for data streams has become an important and active research topic. A major challenge posed by data streams is that their underlying concepts can change over time, which requires current classifiers to be revised accordingly and in a timely manner. To detect concept change, a common methodology is to observe the online classification accuracy. If accuracy drops below some threshold value, a concept change is deemed to have taken place. An implicit assumption behind this methodology is that any drop in classification accuracy can be interpreted as a symptom of concept change. Unfortunately, this assumption is often violated in the real world, where data streams carry noise that can also introduce a significant reduction in classification accuracy. To compound this problem, traditional noise cleansing methods are ill-suited to data streams. Those methods normally need to scan data multiple times, whereas learning for data streams can only afford a one-pass scan because of the data’s high speed and huge volume. Another open problem in data stream classification is how to deal with missing values. When new instances containing missing values arrive, how a learning model classifies them and how it updates itself according to them remain largely unexplored. To solve these problems, this paper proposes a novel classification algorithm, flexible decision tree (FlexDT), which extends fuzzy logic to data stream classification. The advantages are three-fold. First, FlexDT offers a flexible structure to effectively and efficiently handle concept change. Second, FlexDT is robust to noise. Hence it can prevent noise from interfering with classification accuracy, and an accuracy drop can be safely attributed to concept change. Third, it deals with missing values in an elegant way. Extensive evaluations are conducted to compare FlexDT with representative existing data stream classification algorithms using a large suite of data streams and various statistical tests. Experimental results suggest that FlexDT offers a significant benefit to data stream classification in real-world scenarios where concept change, noise and missing values coexist.
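The threshold-based methodology that the abstract above describes as common practice can be sketched as follows. This is not FlexDT, which the abstract does not specify in enough detail to reproduce. It is the baseline detector the abstract criticizes, and the class name, window size and threshold are assumptions chosen for this example.

```python
from collections import deque


class AccuracyDriftDetector:
    """Flags a concept change when sliding-window accuracy falls below a threshold."""

    def __init__(self, window=10, threshold=0.75):
        self.window = deque(maxlen=window)  # most recent prediction outcomes
        self.threshold = threshold

    def update(self, correct):
        """Record one prediction outcome; return True if a change is signalled."""
        self.window.append(1 if correct else 0)
        if len(self.window) < self.window.maxlen:
            return False  # not enough evidence yet
        return sum(self.window) / len(self.window) < self.threshold


detector = AccuracyDriftDetector(window=10, threshold=0.75)
# Ten correct predictions followed by four errors.
outcomes = [True] * 10 + [False] * 4
flags = [detector.update(o) for o in outcomes]
print(flags.index(True))  # position of the first alarm in the stream
```

The detector cannot tell whether the four errors come from a changed concept or from noisy labels, which is the weakness the abstract points out.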

9.
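A privacy-preserving clustering algorithm that limits privacy leakage   Total citations: 1, self-citations: 0, citations by others: 1
To address privacy leakage in data mining under extreme conditions, this paper analyzes why adding Laplace noise during data clustering can prevent privacy leakage and, by combining principal component analysis with noise perturbation, proposes a privacy-preserving clustering algorithm that limits privacy leakage. The algorithm uses principal component analysis to remove correlations in the data, adds Laplace noise to the principal component vectors of the data, and then computes the change in distances between the perturbed data points. This prevents the perturbed data from being restored, thereby limiting privacy leakage in privacy-preserving cluster mining. Simulation results show that the algorithm is correct and effective at limiting privacy leakage during data clustering.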

10.
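Privacy-preserving computational geometry is a special class of secure multi-party computation problems with important application prospects in military, commercial, and other fields. In the semi-honest model, a privacy-preserving protocol for deciding whether a point lies inside a polygon is designed using a point-line cross product protocol. Based on that protocol, a privacy-preserving protocol for computing the intersection area of two polygons is proposed. The correctness, security, and complexity of these protocols are analyzed and proved.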
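The geometric idea behind a cross-product-based containment test can be sketched in plaintext as below. The sketch contains no cryptography: it only shows how the sign of the point-line cross product decides on which side of each edge a point lies. The restriction to convex polygons, the function names and the sample square are assumptions made for this example, not details taken from the abstract above.

```python
def cross(o, a, b):
    """Cross product of vectors o->a and o->b; its sign gives the side of b relative to line o->a."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def point_in_convex_polygon(p, polygon):
    """True if p is inside or on the boundary of a convex polygon given in vertex order."""
    n = len(polygon)
    signs = [cross(polygon[i], polygon[(i + 1) % n], p) for i in range(n)]
    # The point is inside when it lies on the same side of every edge.
    return all(s >= 0 for s in signs) or all(s <= 0 for s in signs)


square = [(0, 0), (4, 0), (4, 4), (0, 4)]
print(point_in_convex_polygon((2, 2), square))  # point in the middle of the square
print(point_in_convex_polygon((5, 2), square))  # point to the right of the square
```

In a privacy-preserving setting, one party would hold the point and the other the polygon, and each cross product sign would be computed through a secure protocol instead of in the clear.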

11.
符燕华  顾嗣扬 《计算机应用》2006,26(1):213-0215
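Association rules are mined from vertically partitioned data using a scalar product method while preserving privacy. A scalar product algorithm is given and its security is analyzed, and an example illustrates how to use the scalar product algorithm for vertically partitioned distributed data mining.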
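Why a scalar product is enough to mine association rules from vertically partitioned data can be seen in a small plaintext sketch. Each party holds one binary column over the same transactions, and the support count of the two-item itemset equals the scalar product of the two columns. The column names and values are hypothetical, and the secure protocol that would hide each column from the other party is not shown because the abstract above does not describe it.

```python
def support_count(col_a, col_b):
    """Number of transactions containing both items: the scalar product of two 0/1 columns."""
    return sum(x * y for x, y in zip(col_a, col_b))


# Hypothetical columns over the same five transactions.
bread = [1, 1, 0, 1, 1]  # held by party A
milk = [1, 0, 0, 1, 1]   # held by party B

count = support_count(bread, milk)
support = count / len(bread)
print(count, support)
```

A privacy-preserving variant would replace `support_count` with a secure scalar product protocol so that only the count is revealed.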

12.
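To address the low security and low mining efficiency of privacy-preserving association rule mining algorithms for vertically partitioned data, a privacy-preserving association rule mining algorithm is proposed. The algorithm adopts a new dot product protocol that hides the original input by introducing an inverse matrix and random numbers, which gives it good security. It mines maximal frequent itemsets instead of all frequent itemsets and uses a depth-first traversal strategy combined with several pruning strategies, which markedly speeds up frequent itemset generation and greatly reduces computational cost. Experimental results show that mining efficiency is greatly improved.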

13.
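Privacy-preserving data publishing has been one of the active research topics in recent years. It studies how to avoid disclosing sensitive data in published data while keeping the published data highly useful. Based on a fuzzy-set privacy protection model, the proposed method first computes the prior probabilities of the training sample data and then achieves privacy protection by generalizing a single sensitive attribute and two correlated attributes using Bayesian classification. Experiments verify that the fuzzy-set-based privacy protection model (Fuzzy k-anonymity) is more efficient than the classical k-anonymity privacy protection model, with a high degree of privacy protection and strong data usability.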

14.
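With the development of smart cities, indoor positioning has become an important foundation for all kinds of location-based services. In some indoor scenarios, the server needs to compile user access statistics for specific areas while protecting users' location privacy. To this end, a multi-level sensitive-area indoor positioning algorithm based on Bloom filters and Paillier homomorphic encryption is proposed, so that the server can determine whether a user has entered a specific area while the user's location privacy is protected. The algorithm partitions the indoor space according to the category or sensitivity level of each area, encrypts the data of the server and the user with the Paillier algorithm, and uses an improved Bloom-filter-based algorithm to determine the user's location in the ciphertext domain, which reduces the large communication and computation overhead caused by encryption. Experimental results on public data sets show that, compared with the existing spatial Bloom filter algorithm, the proposed hash array merging algorithm has a lower false positive rate at the same communication and computation overhead, and it can be extended to other applications to encode multi-class data sets.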
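For readers unfamiliar with the data structure underlying the abstract above, here is a minimal plaintext Bloom filter sketch. It shows only how an area's grid cells could be encoded and queried. The Paillier encryption, the multi-level sensitivity encoding and the hash array merging described in the abstract are not reproduced. The class name, the SHA-256-based hashing scheme and the cell labels are assumptions made for this example.

```python
import hashlib


class BloomFilter:
    """Basic Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, size=256, num_hashes=3):
        self.size = size
        self.num_hashes = num_hashes
        self.bits = [0] * size

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        return all(self.bits[pos] for pos in self._positions(item))


# Hypothetical grid cells belonging to one sensitive area.
area = BloomFilter()
for cell in ["cell-3-17", "cell-3-18", "cell-4-17"]:
    area.add(cell)

print(area.might_contain("cell-3-18"))  # a cell that was added is always reported
```

A query for a cell that was never added usually returns False but can return True, which is the false positive rate the abstract refers to.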

15.
马敏耀  徐艺  刘卓 《计算机应用》2019,39(9):2636-2640
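DNA sequences carry important biological information about the human body, and how to correctly compare different DNA sequences while protecting privacy has become a pressing scientific problem. The Hamming distance characterizes, to some extent, the similarity of two DNA sequences, and this paper studies the privacy-preserving computation of the Hamming distance between DNA sequences. First, a 0-1 encoding rule for DNA sequences is defined, which encodes a DNA sequence of length n into a 0-1 string of length 4n, and it is proved that the Hamming distance between two DNA sequences equals half the Hamming distance between their 0-1 encoded strings. Based on this result, and with the GM encryption algorithm as the main cryptographic tool, a secure two-party computation protocol for the Hamming distance between DNA sequences is constructed. In the semi-honest adversary model, the correctness of the protocol is proved, a simulator-based security proof is given, and the efficiency of the protocol is analyzed.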

16.
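Research on privacy preservation in distributed cluster analysis based on SMC protocols   Total citations: 1, self-citations: 0, citations by others: 1
To address privacy preservation in cluster analysis based on Euclidean distance, a new privacy-preserving method is proposed. The method applies secure multi-party computation protocols to two data models, horizontally partitioned and vertically partitioned, so that cluster analysis on either model preserves privacy while keeping the Euclidean distances between data points unchanged (that is, the mining results remain accurate).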

17.
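Privacy preservation techniques for multiple sensitive attributes in data publishing*   Total citations: 1, self-citations: 0, citations by others: 1
To address privacy leakage in the publishing of data with multiple sensitive attributes, a (g,l)-grouping method is proposed. It builds on an analysis of the multi-dimensional bucket grouping technique and inherits the idea of protecting private data through lossy join. The sensitive attributes are first grouped according to their respective sensitivity, and the number of groups is then used as the size of each dimension of the multi-dimensional bucket. Two different linear-time grouping algorithms are also given: the general (g,l)-grouping algorithm (GGLG) and the maximal-sensitivity-first algorithm (MSF). Extensive experiments on real data sets show that the method markedly reduces privacy leakage and strengthens the security of data publishing.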

18.
The process of identifying which records in two or more databases correspond to the same entity is an important aspect of data quality activities such as data pre-processing and data integration. Known as record linkage, data matching or entity resolution, this process has attracted interest from researchers in fields such as databases and data warehousing, data mining, information systems, and machine learning. Record linkage has various challenges, including scalability to large databases, accurate matching and classification, and privacy and confidentiality. The last challenge arises because personal identifying data, such as names, addresses and dates of birth of individuals, are commonly used in the linkage process. When databases are linked across organizations, the issue of how to protect the privacy and confidentiality of such sensitive information is crucial to the successful application of record linkage.

19.
崔炜荣  杜承烈 《计算机应用》2018,38(4):1051-1057
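To address privacy preservation in user attribute matching in social networks, a privacy-preserving user attribute matching method is proposed. The method is built on anonymous attribute-based encryption and can be applied in centralized attribute matching scenarios. Each user represents their self-description and their friend preferences with two attribute lists, and attribute information is hidden by converting the self-description into an attribute key and the friend preferences into a ciphertext access control policy. The server decides whether two users match by checking whether decryption succeeds. With this method, the server can complete bidirectional attribute matching without learning the specific attributes of either party. Analysis and experimental results show that the method protects privacy while remaining computationally efficient and practical.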

20.
Decentralized probabilistic reasoning, constraint reasoning, and decision theoretic reasoning are some essential tasks of cooperative multiagent systems. Several frameworks for these tasks organize agents into a junction tree (JT). We show that existing techniques for JT existence recognition and construction leak information on private variables, shared variables, agent identities and adjacency that can potentially be protected. We present a scheme to quantify these privacy losses. We develop two novel algorithms, one for JT existence recognition and one for JT construction when a JT exists, that provide strong guarantees of agent privacy. Our experimental comparison shows that the proposed algorithms outperform existing techniques, one of them having the lowest privacy loss and the other having no privacy loss, while being more efficient than most alternatives.
