首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
2.
The increasing availability of personal data of a sequential nature, such as time-stamped transaction or location data, enables increasingly sophisticated sequential pattern mining techniques. However, privacy is at risk if it is possible to reconstruct the identity of individuals from sequential data. Therefore, it is important to develop privacy-preserving techniques that support publishing of really anonymous data, without altering the analysis results significantly. In this paper we propose to apply the Privacy-by-design paradigm for designing a technological framework to counter the threats of undesirable, unlawful effects of privacy violation on sequence data, without obstructing the knowledge discovery opportunities of data mining technologies. First, we introduce a k-anonymity framework for sequence data, by defining the sequence linking attack model and its associated countermeasure, a k-anonymity notion for sequence datasets, which provides a formal protection against the attack. Second, we instantiate this framework and provide a specific method for constructing the k-anonymous version of a sequence dataset, which preserves the results of sequential pattern mining, together with several basic statistics and other analytical properties of the original data, including the clustering structure. A comprehensive experimental study on realistic datasets of process-logs, web-logs and GPS tracks is carried out, which empirically shows how, in our proposed method, the protection of privacy meets analytical utility.  相似文献   

3.
近年来,随着无线通信技术的迅猛发展,推动了基于位置服务(Location-based services,LBS)的发展进程.而其中兴趣点(Point of Interest,POI)查询是基于位置服务最重要的应用之一.针对在路网环境下,用户查询过程中位置隐私泄露的问题,提出了一种新的位置k匿名隐私保护方法.首先,匿名服务器将兴趣点作为种子节点生成网络Voronoi图,将整个路网划分为相互独立且不重叠的网络Voronoi单元(Network Voronoi Cell,NVC);其次,利用Hilbert曲线遍历路网空间,并按照Hilbert顺序,对路网上所有的兴趣点进行排序.当用户发起查询时,提出的匿名算法通过查找与用户所在NVC的查询频率相同且位置分散的k-1个NVC,并根据用户的相对位置在NVC内生成匿名位置,从而保证了生成的匿名集中位置之间的相互性,克服了传统k-匿名不能抵御推断攻击的缺陷.最后,理论分析和实验结果表明本文提出的隐私保护方案,能有效保护用户位置隐私.  相似文献   

4.
Users are vulnerable to privacy risks when providing their location information to location-based services (LBS). Existing work sacrifices the quality of LBS by degrading spatial and temporal accuracy ...  相似文献   

5.
Recently, more and more social network data have been published in one way or another. Preserving privacy in publishing social network data becomes an important concern. With some local knowledge about individuals in a social network, an adversary may attack the privacy of some victims easily. Unfortunately, most of the previous studies on privacy preservation data publishing can deal with relational data only, and cannot be applied to social network data. In this paper, we take an initiative toward preserving privacy in social network data. Specifically, we identify an essential type of privacy attacks: neighborhood attacks. If an adversary has some knowledge about the neighbors of a target victim and the relationship among the neighbors, the victim may be re-identified from a social network even if the victim’s identity is preserved using the conventional anonymization techniques. To protect privacy against neighborhood attacks, we extend the conventional k-anonymity and l-diversity models from relational data to social network data. We show that the problems of computing optimal k-anonymous and l-diverse social networks are NP-hard. We develop practical solutions to the problems. The empirical study indicates that the anonymized social network data by our methods can still be used to answer aggregate network queries with high accuracy.  相似文献   

6.
Varieties of sensitive personal information become a privacy concern for social networks. However, characteristics of social graphs could be utilized by attackers to re-identify target entities of social networks. In this paper, we first analyze a new attack model named bin-based attack, which re-identifies social individuals in social networks, according to their graph structure characteristics. For bin-based attack, we propose a novel k-anonymity scheme. With this scheme, social individuals are completely k-anonymity protection. Experiments illustrate the effectiveness of the proposed scheme. The utility of anonymized networks are demonstrated with the results of vertex degree, and betweenness.  相似文献   

7.
k-Anonymity is a method for providing privacy protection by ensuring that data cannot be traced to an individual. In a k-anonymous dataset, any identifying information occurs in at least k tuples. To achieve optimal and practical k-anonymity, recently, many different kinds of algorithms with various assumptions and restrictions have been proposed with different metrics to measure quality. This paper evaluates a family of clustering-based algorithms that are more flexible and even attempts to improve precision by ignoring the restrictions of user-defined Domain Generalization Hierarchies. The evaluation of the new approaches with respect to cost metrics shows that metrics may behave differently with different algorithms and may not correlate with some applications’ accuracy on output data.  相似文献   

8.
Frequent pattern mining discovers sets of items that frequently appear together in a transactional database; these can serve valuable economic and research purposes. However, if the database contains sensitive data (e.g., user behavior records, electronic health records), directly releasing the discovered frequent patterns with support counts will carry significant risk to the privacy of individuals. In this paper, we study the problem of how to accurately find the top-k frequent patterns with noisy support counts on transactional databases while satisfying differential privacy. We propose an algorithm, called differentially private frequent pattern (DFP-Growth), that integrates a Laplace mechanism and an exponential mechanism to avoid privacy leakage. We theoretically prove that the proposed method is (λ, δ)-useful and differentially private. To boost the accuracy of the returned noisy support counts, we take consistency constraints into account to conduct constrained inference in the post-processing step. Extensive experiments, using several real datasets, confirm that our algorithm generates highly accurate noisy support counts and top-k frequent patterns.  相似文献   

9.
针对动态社会网络数据多重发布中用户的隐私信息泄露问题,结合攻击者基于背景知识的结构化攻击,提出了一种动态社会网络隐私保护方法。该方法首先在每次发布时采用k-同构算法把原始图有效划分为k个同构子图,并最小化匿名成本;然后对节点ID泛化,阻止节点增加或删除时攻击者结合多重发布间的关联识别用户的隐私信息。通过数据集实验证实,提出的方法有较高的匿名质量和较低的信息损失,能有效保护动态社会网络中用户的隐私。  相似文献   

10.
k-anonymity provides a measure of privacy protection by preventing re-identification of data to fewer than a group of k data items. While algorithms exist for producing k-anonymous data, the model has been that of a single source wanting to publish data. Due to privacy issues, it is common that data from different sites cannot be shared directly. Therefore, this paper presents a two-party framework along with an application that generates k-anonymous data from two vertically partitioned sources without disclosing data from one site to the other. The framework is privacy preserving in the sense that it satisfies the secure definition commonly defined in the literature of Secure Multiparty Computation.  相似文献   

11.
Data integration methods enable different data providers to flexibly integrate their expertise and deliver highly customizable services to their customers. Nonetheless, combining data from different sources could potentially reveal person-specific sensitive information. In VLDBJ 2006, Jiang and Clifton (Very Large Data Bases J (VLDBJ) 15(4):316–333, 2006) propose a secure Distributed k-Anonymity (DkA) framework for integrating two private data tables to a k-anonymous table in which each private table is a vertical partition on the same set of records. Their proposed DkA framework is not scalable to large data sets. Moreover, DkA is limited to a two-party scenario and the parties are assumed to be semi-honest. In this paper, we propose two algorithms to securely integrate private data from multiple parties (data providers). Our first algorithm achieves the k-anonymity privacy model in a semi-honest adversary model. Our second algorithm employs a game-theoretic approach to thwart malicious participants and to ensure fair and honest participation of multiple data providers in the data integration process. Moreover, we study and resolve a real-life privacy problem in data sharing for the financial industry in Sweden. Experiments on the real-life data demonstrate that our proposed algorithms can effectively retain the essential information in anonymous data for data analysis and are scalable for anonymizing large data sets.  相似文献   

12.
ProbLog is a probabilistic extension of Prolog. Given the complexity of exact inference under ProbLog??s semantics, in many applications in machine learning approximate inference is necessary. Current approximate inference algorithms for ProbLog however require either dealing with large numbers of proofs or do not guarantee a low approximation error. In this paper we introduce a new approximate inference algorithm which addresses these shortcomings. Given a user-specified parameter k, this algorithm approximates the success probability of a query based on at most k proofs and ensures that the calculated probability p is (1?1/e)p ???p??p ?, where p ? is the highest probability that can be calculated based on any set of k proofs. Furthermore a useful feature of the set of calculated proofs is that it is diverse. Our experiments show the utility of the proposed algorithm.  相似文献   

13.
黄光球  程凯歌 《计算机工程》2011,37(10):131-133
鉴于网络攻击过程中存在攻击者被检测到的可能性,将攻击图转化成Petri网并进行扩展生成EPN模型,依据库所的攻击成本值求解网络攻击的最佳攻击路径和攻击成本,基于最大流概念定义系统最大承受攻击能力。从二维角度分析网络攻击,提出攻击可行性概念及基于攻击图的扩充Petri网攻击模型,该模型相关算法的遍历性由EPN推理规则保证。当原攻击图的弧较多时,算法的复杂度低于Dijkstra算法,攻击图的攻击发起点和攻击目标点间的路径越多,算法越有效。实验结果证明,该模型可以对网络攻击过程进行高效的综合分析。  相似文献   

14.
We present a multidisciplinary solution to the problems of anonymous microaggregation and clustering, illustrated with two applications, namely privacy protection in databases, and private retrieval of location-based information. Our solution is perturbative, is based on the same privacy criterion used in microdata k-anonymization, and provides anonymity through a substantial modification of the Lloyd algorithm, a celebrated quantization design algorithm, endowed with numerical optimization techniques.Our algorithm is particularly suited to the important problem of k-anonymous microaggregation of databases, with a small integer k representing the number of individual respondents indistinguishable from each other in the published database. Our algorithm also exhibits excellent performance in the problem of clustering or macroaggregation, where k may take on arbitrarily large values. We illustrate its applicability in this second, somewhat less common case, by means of an example of location-based services. Specifically, location-aware devices entrust a third party with accurate location information. This party then uses our algorithm to create distortion-optimized, size-constrained clusters, where k nearby devices share a common centroid location, which may be regarded as a distorted version of the original one. The centroid location is sent back to the devices, which use it when contacting untrusted location-based information providers, in lieu of the exact home location, to enforce k-anonymity.We compare the performance of our novel algorithm to the state-of-the-art microaggregation algorithm MDAV, on both synthetic and standardized real data, which encompass the cases of small and large values of k. The most promising aspect of our proposed algorithm is its capability to maintain the same k-anonymity constraint, while outperforming MDAV by a significant reduction in data distortion, in all the cases considered.  相似文献   

15.
An important class of LBSs is supported by the moving k nearest neighbor (MkNN) query, which continuously returns the k nearest data objects for a moving user. For example, a tourist may want to observe the five nearest restaurants continuously while exploring a city so that she can drop in to one of them anytime. Using this kind of services requires the user to disclose her location continuously and therefore may cause privacy leaks derived from the user's locations. A common approach to protecting a user's location privacy is the use of imprecise locations (e.g., regions) instead of exact positions when requesting LBSs. However, simply updating a user's imprecise location to a location-based service provider (LSP) cannot ensure a user's privacy for an MkNN query: continuous disclosure of regions enable LSPs to refine more precise location of the user. We formulate this type of attack to a user's location privacy that arises from overlapping consecutive regions, and provide the first solution to counter this attack. Specifically, we develop algorithms which can process an MkNN query while protecting the user's privacy from the above attack. Extensive experiments validate the effectiveness of our privacy protection technique and the efficiency of our algorithm.  相似文献   

16.
In this paper, we propose a novel edge modification technique that better preserves the communities of a graph while anonymizing it. By maintaining the core number sequence of a graph, its coreness, we retain most of the information contained in the network while allowing changes in the degree sequence, i. e. obfuscating the visible data an attacker has access to. We reach a better trade-off between data privacy and data utility than with existing methods by capitalizing on the slack between apparent degree (node degree) and true degree (node core number). Our extensive experiments on six diverse standard network datasets support this claim. Our framework compares our method to other that are used as proxies for privacy protection in the relevant literature. We demonstrate that our method leads to higher data utility preservation, especially in clustering, for the same levels of randomization and k-anonymity.  相似文献   

17.
社交网络边权重表示节点属性相似性时,针对边权重能导致节点敏感属性泄露的问题,提出一种利用差分隐私保护模型的扰动策略进行边权重保护。首先根据社交网络构建属性相似图和非属性相似图,同时建立差分隐私保护算法;然后对属性相似图及非属性相似图边权重进行扰动时,设计扰动方案,并按扰动方案对属性相似图及非属性相似图进行扰动。实现了攻击者无法根据扰动后边权重判断节点属性相似性,从而防止节点敏感属性泄漏,而且该方法能够抵御攻击者拥有最大背景知识的攻击。从理论上证明了算法的可行性,并通过实验验证了算法的可行性及有效性。  相似文献   

18.
A set of k spanning trees rooted at the same vertex r in a graph G is said to be independent if for each vertex x other than r, the k paths from r to x, one path in each spanning tree, are internally disjoint. Using independent spanning trees (ISTs) one can design fault-tolerant broadcasting schemes and increase message security in a network. Thus, the problem of ISTs on graphs has been received much attention. Recently, Yang et al. proposed a parallel algorithm for generating optimal ISTs on the hypercube. In this paper, we propose a similar algorithm for generating optimal ISTs on Cartesian product of complete graphs. The algorithm can be easily implemented in parallel or distributed systems. Moreover, the proof of its correctness is simpler than that of Yang et al.  相似文献   

19.
Maximal clique enumeration is a fundamental problem in graph theory and has been extensively studied. However, maximal clique enumeration is time-consuming in large graphs and always returns enormous cliques with large overlaps. Motivated by this, in this paper, we study the diversified top-k clique search problem which is to find top-k cliques that can cover most number of nodes in the graph. Diversified top-k clique search can be widely used in a lot of applications including community search, motif discovery, and anomaly detection in large graphs. A naive solution for diversified top-k clique search is to keep all maximal cliques in memory and then find k of them that cover most nodes in the graph by using the approximate greedy max k-cover algorithm. However, such a solution is impractical when the graph is large. In this paper, instead of keeping all maximal cliques in memory, we devise an algorithm to maintain k candidates in the process of maximal clique enumeration. Our algorithm has limited memory footprint and can achieve a guaranteed approximation ratio. We also introduce a novel light-weight \(\mathsf {PNP}\)-\(\mathsf {Index}\), based on which we design an optimal maximal clique maintenance algorithm. We further explore three optimization strategies to avoid enumerating all maximal cliques and thus largely reduce the computational cost. Besides, for the massive input graph, we develop an I/O efficient algorithm to tackle the problem when the input graph cannot fit in main memory. We conduct extensive performance studies on real graphs and synthetic graphs. One of the real graphs contains 1.02 billion edges. The results demonstrate the high efficiency and effectiveness of our approach.  相似文献   

20.
为了解决数据发布和分析过程中用户真实数据信息被披露的问题,降低攻击者通过差分攻击和概率推理攻击获取真实结果的概率,提出了一种基于置信度分析的差分隐私保护参数配置方法。在攻击者概率推理攻击模型下对攻击者置信度进行分析,使之不高于根据数据隐私属性所设置的隐私概率阈值。所提出的方法能够针对不同查询用户查询权限的差异配置更加合理的隐私保护参数,避免了隐私披露的风险。实验分析表明,所提出的方法根据查询权限、噪声分布特性以及数据隐私属性分析攻击者置信度与隐私保护参数的对应关系,并据此推导出隐私保护参数的配置公式,从而在不违背隐私保护概率阈值的情况下配置合适的ε参数。  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号