首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
Record linkage aims at finding the matching records from two or multiple different databases. Many approximate string matching methods in privacy-preserving record linkage have been presented. In this paper, we study the problem of secure record linkage between two data files in the two-party computation setting. We note that two records are linked when all the Hamming distances between their attributes are smaller than some fixed thresholds. We firstly construct two efficient secure protocols for computing the powers of an encrypted value and implementing zero test on an encrypted value, then present an efficient protocol within constant rounds for computing the Hamming distance based record linkage in the presence of malicious adversaries by transferring these two protocols. We also discuss the extension of our protocol for settling the Levenshtein distance based record linkage problem.  相似文献   

2.
Large-scale k-means clustering with user-centric privacy-preservation   总被引:1,自引:1,他引:0  
A k-means clustering with a new privacy-preserving concept, user-centric privacy preservation, is presented. In this framework, users can conduct data mining using their private information by storing them in their local storage. After the computation, they obtain only the mining result without disclosing private information to others. In most cases, the number of parties that can join conventional privacy-preserving data mining has been assumed to be only two. In our framework, we assume large numbers of parties join the protocol; therefore, not only scalability but also asynchronism and fault-tolerance is important. Considering this, we propose a k-mean algorithm combined with a decentralized cryptographic protocol and a gossip-based protocol. The computational complexity is O(log n) with respect to the number of parties n, and experimental results show that our protocol is scalable even with one million parties.  相似文献   

3.
This paper presents Araneola (Araneola means “little spider” in Latin.), a scalable reliable application-level multicast system for highly dynamic wide-area environments. Araneola supports multi-point to multi-point reliable communication in a fully distributed manner, while incurring constant load (in terms of message and space complexity) on each node. For a tunable parameter k≥3, Araneola constructs and dynamically maintains a basic overlay structure in which each node’s degree is either k or k+1, and roughly 90% of the nodes have degree k. Empirical evaluation shows that Araneola’s basic overlay achieves three important mathematical properties of k-regular random graphs (i.e., random graphs in which each node has exactly k neighbors) with N nodes: (i) its diameter grows logarithmically with N; (ii) it is generally k-connected; and (iii) it remains highly connected following random removal of linear-size subsets of edges or nodes. The overlay is constructed and maintained at a low cost: each join, leave, or failure is handled locally, and entails the sending of only about 3k messages in total, independent of N. Moreover, this cost decreases as the churn rate increases.The low degree of Araneola’s basic overlay structure allows for allocating plenty of additional bandwidth for specific application needs. In this paper, we give an example for such a need — communicating with nearby nodes; we enhance the basic overlay with additional links chosen according to geographic proximity and available bandwidth. We show that this approach, i.e., a combination of random and nearby links, reduces the number of physical hops messages traverse without hurting the overlay’s robustness, as compared with completely random Araneola overlays (in which all the links are random) with the same average node degree.Given Araneola’s overlay, we sketch out several message dissemination techniques that can be implemented on top of this overlay. We present a full implementation and evaluation of a gossip-based multicast scheme, with up to 10,000 nodes. We show that compared with a (non-overlay-based) gossip-based multicast protocol, gossiping over Araneola achieves substantial improvements in load, reliability, and latency.  相似文献   

4.
When several data owners possess data on different records but the same variables, known as horizontally partitioned data, the owners can improve statistical inferences by sharing their data with each other. Often, however, the owners are unwilling or unable to share because the data are confidential or proprietary. Secure computation protocols enable the owners to compute parameter estimates for some statistical models, including linear regressions, without sharing individual records’ data. A drawback to these techniques is that the model must be specified in advance of initiating the protocol, and the usual exploratory strategies for determining good-fitting models have limited usefulness since the individual records are not shared. In this paper, we present a protocol for secure adaptive regression splines that allows for flexible, semi-automatic regression modeling. This reduces the risk of model mis-specification inherent in secure computation settings. We illustrate the protocol with air pollution data.  相似文献   

5.
When several data owners possess data on different records but the same variables, known as horizontally partitioned data, the owners can improve statistical inferences by sharing their data with each other. Often, however, the owners are unwilling or unable to share because the data are confidential or proprietary. Secure computation protocols enable the owners to compute parameter estimates for some statistical models, including linear regressions, without sharing individual records’ data. A drawback to these techniques is that the model must be specified in advance of initiating the protocol, and the usual exploratory strategies for determining good-fitting models have limited usefulness since the individual records are not shared. In this paper, we present a protocol for secure adaptive regression splines that allows for flexible, semi-automatic regression modeling. This reduces the risk of model mis-specification inherent in secure computation settings. We illustrate the protocol with air pollution data.  相似文献   

6.
For secure communications in public network environments, various three-party authenticated key exchange (3PAKE) protocols are proposed to provide the transaction confidentiality and efficiency. In 2008, Chen et al. proposed a round-efficient 3PAKE protocol to provide the computation and communication efficiency for user authentication and session key exchange. However, we discover that the computation costs and communication loads of their protocol are still high so that it cannot be applied to mobile communications. Therefore, we propose an efficient three-party authenticated key exchange protocol based upon elliptic curve cryptography for mobile-commerce environments. Because the elliptic curve cryptography is used, the proposed 3PAKE protocol has low computation costs and light communication loads. Compared with Chen et al.’s protocol, the proposed protocol is more suitable and practical for mobile-commerce environments.  相似文献   

7.
In many network applications, including distant learning, audio webcasting, video streaming, and online gaming, often a source has to send data to many receivers. IP multicasts and application-layer multicasts provide efficient and scalable one-to-many or many-to-many communications. A common secret key, the group key, shared by multiple users can be used to secure the information transmitted in the multicast communication channel. In this paper, a new group key management protocol is proposed to reduce the communication and computation overhead of group key rekeying caused by membership changes. With shared key derivation, new keys derivable by members themselves do not have to be encrypted or delivered by the server, and the performance of synchronous and asynchronous rekeying operations, including single join, single leave, and batch update, is thus improved. The proposed protocol is shown to be secure and immune to collusion attacks, and it outperforms the other comparable protocols from our analysis and simulation. The protocol is particularly efficient with binary key trees and asynchronous rekeying, and it can be tuned to meet different rekeying delay or key size requirements.  相似文献   

8.
The k Nearest Neighbor (kNN) join operation associates each data object in one data set with its k nearest neighbors from the same or a different data set. The kNN join on high-dimensional data (high-dimensional kNN join) is a very expensive operation. Existing high-dimensional kNN join algorithms were designed for static data sets and therefore cannot handle updates efficiently. In this article, we propose a novel kNN join method, named kNNJoin +, which supports efficient incremental computation of kNN join results with updates on high-dimensional data. As a by-product, our method also provides answers for the reverse kNN queries with very little overhead. We have performed an extensive experimental study. The results show the effectiveness of kNNJoin+ for processing high-dimensional kNN joins in dynamic workloads.  相似文献   

9.

Nowadays, computing on encrypted data seems to be more practical than a few years ago, thanks to the emergence of new Homomorphic Encryption schemes. In this paper, an algorithm based on Homomorphic Encryption for Arithmetic of Approximate Numbers (Cheon et al., in: Takagi, Peyrin (eds) Advances in cryptology—ASIACRYPT 2017, Springer, Cham, pp 409–437, 2017) (HEAAN, or also CKKS) scheme, that is able to perform a secure k-means algorithm which processes encrypted data, has been studied and presented. The performance of the classifier running on encrypted data has been evaluated using a standard k-means algorithm that works on plain data as a supervised structure, since the results are obtained by approximated computations. The main point of this paper is to take existent theoretical techniques (for example approximations of \(\text {sgn}(x)\)), to use them and to observe if they are valid in practical applications. The output of the algorithm is a set of k encrypted masks that can be applied to the original dataset in order to obtain different clusters. The setting is a standard client–server one. The workload is heavily server-centric, as the client only has to execute a light masking algorithm at the end of each iteration, which, excluding the decryption, is faster than a plain k-means iteration; the main disadvantage concerns the accuracy of the results. Experiments show that the algorithm can be executed fairly quickly: the execution time of the training phase is on the order of seconds, while classification is on the order of tenths of a second.

  相似文献   

10.
王洪亚  杨利宏  刘晓强 《软件学报》2016,27(12):3051-3066
相似连接算法在数据清理、数据集成和重复网页检测等领域有着广泛的应用.现有相似连接算法有两种类型:基于相似度阈值的相似连接和Top-k相似连接.Top-k连接算法非常适合于相似度阈值未知的应用场景,目前最为有效的Top-k相似连接算法是Xiao等人提出的Topk-join.为了解决Topk-join中存在的性能问题,提出了一种Top-k相似连接算法Opt-join,该算法将Token批处理技术集成在现有的事件驱动框架中,以降低前缀事件的处理代价;通过置换哈希查找与过滤操作的执行位置来降低哈希查找代价,并理论证明了该置换的正确性.实验结果表明:与Topk-join算法相比,Opt-join取得了1.28倍~3.09倍的性能提升.实验数据还显示:随着数据长度的增加或k值的增长,Opt-join的性能优势有不断增加的趋势.  相似文献   

11.
本文中,我们首先证明了李增鹏等人提出的多比特多密钥全同态加密方案(MFHE)满足密钥同态性质,利用此性质,可以通过门限解密得到最终解密结果.使用该方案,我们设计了一个在CRS模型下和半恶意攻击者模型下安全的三轮多方计算协议(MPC).该安全多方计算协议的安全性是基于容错学习问题(LWE)的两个变种问题Ferr LWE和Some are errorless.LWE,而且,通过非交互的零知识证明,我们可以把半恶意攻击者模型下安全的三轮多方计算协议转变为在恶意模型下安全的三轮多方计算协议.  相似文献   

12.
In most of the auction systems the values of bids are known to the auctioneer. This allows him to manipulate the outcome of the auction. Hence, one might be interested in hiding these values. Some cryptographically secure protocols for electronic auctions have been presented in the last decade. Our work extends these protocols in several ways. On the basis of garbled circuits, i.e., encrypted circuits, we present protocols for sealed-bid auctions that fulfill the following requirements: 1) protocols are information-theoretically t-private for honest but curious parties; 2) the number of bits that can be learned by malicious adversaries is bounded by the output length of the auction; 3) the computational requirements for participating parties are very low: only random bit choices and bitwise computation of the XOR-function are necessary. Note that one can distinguish between the protocol that generates a garbled circuit for an auction and the protocol to evaluate the auction. In this paper we address both problems. We will present a t-private protocol for the construction of a garbled circuit that reaches the lower bound of 2t + 1 parties, and Finally, we address the problem of bid changes in an auction. a more randomness efficient protocol for (t + 1)^2 parties  相似文献   

13.
雷斌  许嘉  谷峪  于戈 《软件学报》2013,24(S2):188-199
以无线传感器网络为代表的新型数据应用和以图像处理为基础的传统数据应用都产生了大规模的概率数据.在概率数据的管理中,Top-k相似性连接操作返回最相似的k 对概率数据,具有重要应用价值.直方图是最常用的概率数据模型之一,而EMD(Earth Mover’s Distance)距离因其较强的鲁棒性可更准确地量化直方图概率数据之间的相似性.然而EMD距离的计算却具有三次方的时间复杂度,给基于EMD距离的Top-k 相似性连接带来巨大挑战.基于流行的MapReduce并行处理框架,利用EMD距离对偶线性规划问题的优良特性,提出了两种大规模概率数据上基于EMD距离的Top-k相似性连接算法.首先提出基于块嵌套循环连接思想的基本解决方法,命名为Top-k BNLJ算法.进而改进数据划分策略,提出基于数据局部性进行数据划分的Top-k DLPJ 算法,有效降低了MapReduce作业执行过程中的数据传输量.使用大规模真实数据集对两种算法进行评估,证实了本文提出的Top-k DLPJ算法的高效性和处理大规模数据集时的良好扩展性.  相似文献   

14.
The similarity join has become an important database primitive for supporting similarity searches and data mining. A similarity join combines two sets of complex objects such that the result contains all pairs of similar objects. Two types of the similarity join are well-known, the distance range join, in which the user defines a distance threshold for the join, and the closest pair query or k-distance join, which retrieves the k most similar pairs. In this paper, we propose an important, third similarity join operation called the k-nearest neighbour join, which combines each point of one point set with its k nearest neighbours in the other set. We discover that many standard algorithms of Knowledge Discovery in Databases (KDD) such as k-means and k-medoid clustering, nearest neighbour classification, data cleansing, postprocessing of sampling-based data mining, etc. can be implemented on top of the k-nn join operation to achieve performance improvements without affecting the quality of the result of these algorithms. We propose a new algorithm to compute the k-nearest neighbour join using the multipage index (MuX), a specialised index structure for the similarity join. To reduce both CPU and I/O costs, we develop optimal loading and processing strategies.  相似文献   

15.
Secure multidimensional range queries over outsourced data   总被引:1,自引:0,他引:1  
In this paper, we study the problem of supporting multidimensional range queries on encrypted data. The problem is motivated by secure data outsourcing applications where a client may store his/her data on a remote server in encrypted form and want to execute queries using server??s computational capabilities. The solution approach is to compute a secure indexing tag of the data by applying bucketization (a generic form of data partitioning) which prevents the server from learning exact values but still allows it to check if a record satisfies the query predicate. Queries are evaluated in an approximate manner where the returned set of records may contain some false positives. These records then need to be weeded out by the client which comprises the computational overhead of our scheme. We develop a bucketization procedure for answering multidimensional range queries on multidimensional data. For a given bucketization scheme, we derive cost and disclosure-risk metrics that estimate client??s computational overhead and disclosure risk respectively. Given a multidimensional dataset, its bucketization is posed as an optimization problem where the goal is to minimize the risk of disclosure while keeping query cost (client??s computational overhead) below a certain user-specified threshold value. We provide a tunable data bucketization algorithm that allows the data owner to control the trade-off between disclosure risk and cost. We also study the trade-off characteristics through an extensive set of experiments on real and synthetic data.  相似文献   

16.
In recent years, there have been numerous attempts to extend the k-means clustering protocol for single database to a distributed multiple database setting and meanwhile keep privacy of each data site. Current solutions for (whether two or more) multiparty k-means clustering, built on one or more secure two-party computation algorithms, are not equally contributory, in other words, each party does not equally contribute to k-means clustering. This may lead a perfidious attack where a party who learns the outcome prior to other parties tells a lie of the outcome to other parties. In this paper, we present an equally contributory multiparty k-means clustering protocol for vertically partitioned data, in which each party equally contributes to k-means clustering. Our protocol is built on ElGamal's encryption scheme, Jakobsson and Juels's plaintext equivalence test protocol, and mix networks, and protects privacy in terms that each iteration of k-means clustering can be performed without revealing the intermediate values.  相似文献   

17.
一种基于CPK的传输协议   总被引:1,自引:0,他引:1       下载免费PDF全文
为提高VxWorks中数据传输的效率,结合组合公钥算法的原理和加密通信方法,设计一种高效、安全的嵌入式安全传输(EST)协议。EST协议能够在VxWorks环境下快速建立端到端的通信,实现保密通信,满足实时操作系统的安全需求。给出该协议的设计与实现及相应结果分析,证明了该协议的可行性、有效性。  相似文献   

18.
为高效利用交通资源,在线网约出行(ORH)服务整合车辆供给和乘客请求信息,派遣符合条件的车辆提供非巡游的出行服务。人们在享受ORH服务带来的便利时,也面临着严重的隐私泄露风险。为此,许多研究利用密码学技术设计隐私保护的ORH服务。首先,本文介绍了隐私保护的ORH服务主要面临的城市动态场景下高效计算密态行程开销、实时动态规划密态行程、安全共享不同ORH服务的运力资源等挑战。然后,回顾了欧式距离、路网距离和行驶时间三类行程开销的安全计算方法,其中,欧式距离计算效率高,但误差大,现有路网距离和行驶时长的安全计算方法多数面向静态路网场景,针对城市动态路网场景的安全计算方法有待进一步研究。分析了面向司机、乘客、ORH平台的行程规划问题的求解方法,现有研究往往仅针对司机、乘客或ORH平台的单一目标进行行程规划,事实上行程规划不但要考虑ORH平台自身收益,更要同时兼顾乘客和司机的用户体验。综述了隐私感知的行程预处理方法,单车单客模式、单车多客模式的行程安全共享方法,并总结了其不足与启示。多车单客、多车多客动态模式的行程安全共享有待进一步研究。最后,从城市动态路网下高效的密态行程开销的安全计算与比较、...  相似文献   

19.

The accuracy and performance of deep neural network models become important issues as the applications of deep learning increase. For example, the navigation system of autonomous self-driving vehicles requires very accurate deep learning models. If a self-driving car fails to detect a pedestrian in bad weather, the result can be devastating. If we can increase the model accuracy by increasing the training data, the probability of avoiding such scenarios increases significantly. However, the problem of privacy for consumers and lack of enthusiasm for sharing their personal data, e.g., the recordings of their self-driving car, is an obstacle for using this valuable data. In Blockchain technology, many entities which cannot trust each other in normal conditions can join together to achieve a mutual goal. In this paper, a secure decentralized peer-to-peer framework for training the deep neural network models based on the distributed ledger technology in Blockchain ecosystem is proposed. The proposed framework anonymizes the identity of data providers and therefore can be used as an incentive for consumers to share their private data for training deep learning models. The proposed framework uses the Stellar Blockchain infrastructure for secure decentralized training of the deep models. A deep learning coin is proposed for Blockchain compensation.

  相似文献   

20.
With the prevalence of cloud computing, data owners are motivated to outsource their databases to the cloud server. However, to preserve data privacy, sensitive private data have to be encrypted before outsourcing, which makes data utilization a very challenging task. Existing work either focus on keyword searches and single-dimensional range query, or suffer from inadequate security guarantees and inefficiency. In this paper, we consider the problem of multidimensional private range queries over encrypted cloud data. To solve the problem, we systematically establish a set of privacy requirements for multidimensional private range queries, and propose a multidimensional private range query (MPRQ) framework based on private block retrieval (PBR), in which data owners keep the query private from the cloud server. To achieve both efficiency and privacy goals, we present an efficient and fully privacy-preserving private range query (PPRQ) protocol by using batch codes and multiplication avoiding technique. To our best knowledge, PPRQ is the first to protect the query, access pattern and single-dimensional privacy simultaneously while achieving efficient range queries. Moreover, PPRQ is secure in the sense of cryptography against semi-honest adversaries. Experiments on real-world datasets show that the computation and communication overhead of PPRQ is modest.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号