Similar literature
1.
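The rapid development of the mobile Internet and LBS technologies allows location service providers to easily collect large volumes of user trajectory data. Recent studies show that deep learning methods can extract private information, such as user identifiers, from trajectory datasets. However, existing work mainly targets check-in trajectories collected from social networks, while de-anonymization of GPS trajectories remains little studied. This paper therefore studies deep-learning-based de-anonymization of GPS trajectories. First, a pre-training method for GPS trajectory data is proposed: through sub-trajectory segmentation, location point conversion, and location point embedding, the spatial distance and contextual information in raw GPS trajectories are embedded into fixed-length vectors, so that GPS trajectory data can serve as input to a neural network. Second, a GPS trajectory de-anonymization method based on deep neural network training is proposed: using the vector sequences obtained from pre-training, neural networks such as LSTM and GRU serve as encoders trained to fit user identifiers, which links anonymous trajectory data to users. Finally, the methods are validated on the Geolife trajectory dataset, where the de-anonymization accuracy and Top-5 accuracy reach 56.73% and 73.48%, respectively. The results show that the deep-learning-based method can identify users from anonymous trajectory data fairly accurately.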
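The abstract above reports both plain accuracy and Top-5 accuracy. As a minimal sketch of how such a metric is usually computed (the function name `top_k_accuracy` and the toy scores are illustrative assumptions, not taken from the paper), a prediction counts as a hit when the true user is among the k highest-scoring candidates:

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest scores.

    scores: list of per-sample score lists, one score per candidate user
            (hypothetical classifier output, assumed for illustration).
    labels: list of true user indices, one per sample.
    """
    hits = 0
    for row, label in zip(scores, labels):
        # Rank candidate indices from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Toy data: three trajectories, three candidate users.
toy_scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]
toy_labels = [1, 1, 0]
top1 = top_k_accuracy(toy_scores, toy_labels, 1)
top2 = top_k_accuracy(toy_scores, toy_labels, 2)
```

With k = 1 this reduces to ordinary accuracy, so the Top-5 figure is always at least as large as the plain one.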

2.
With the rapid growth of social network applications, more and more people are participating in social networks, and privacy protection in online social networks has become an important issue. The illegal disclosure or improper use of users’ private information can lead to unacceptable or unexpected consequences in people’s lives. In this paper, we focus on the disclosure of authentic popularity in online social networks. To protect users’ privacy, social networks need to be anonymized. However, existing anonymization algorithms for social networks may cause nontrivial utility loss, because the anonymization process changes the social network’s structure. As a result, the social network’s utility, such as retrieving, reading, and sharing data files among different users, decreases. It is therefore a challenge to develop an effective anonymization algorithm that protects the privacy of users’ authentic popularity in online social networks without decreasing their utility. In this paper, we first design a hierarchical authorization and capability delegation (HACD) model. Based on this model, we propose a novel utility-based popularity anonymization (UPA) scheme, which integrates proxy re-encryption with keyword search techniques, to tackle this issue. We demonstrate that the proposed scheme not only protects the privacy of users’ authentic popularity but also preserves the full utility of the social network. Extensive experiments on large real-world online social networks confirm the efficacy and efficiency of our scheme.

3.
This paper considers the problem of anonymization in large networks and the utility of the released data. Although some anonymization methods for networks exist, most of them cannot be applied to large networks because of their complexity. In this paper, we devise a simple and efficient algorithm for k-degree anonymity in large networks. Our algorithm constructs a k-degree anonymous network with the minimum number of edge modifications. We compare our algorithm with other well-known k-degree anonymity algorithms and demonstrate that information loss in real networks is lowered. Moreover, we consider edge relevance in order to improve the data utility of anonymized networks. By considering the neighbourhood centrality score of each edge, we preserve the most important edges of the network, reducing the information loss and increasing the data utility. An evaluation of clustering processes is performed on our algorithm, showing that edge neighbourhood centrality increases data utility. Lastly, we apply our algorithm to different large real datasets and demonstrate its efficiency and practical utility.
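For readers unfamiliar with the property named above, a graph is commonly called k-degree anonymous when every degree value that occurs in it is shared by at least k nodes. The following is a minimal sketch of a checker for that property only; it is not the paper's construction algorithm, and the function names and edge-list representation are assumptions made for illustration.

```python
from collections import Counter


def degree_sequence(edges):
    """Map each node to its degree in an undirected graph given as an edge list."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return dict(degree)


def is_k_degree_anonymous(edges, k):
    """True if every degree value that occurs is shared by at least k nodes.

    Isolated nodes do not appear in an edge list, so they are ignored here.
    """
    degree_counts = Counter(degree_sequence(edges).values())
    return all(count >= k for count in degree_counts.values())


# Path a-b-c-d: two nodes of degree 1 and two nodes of degree 2.
path = [("a", "b"), ("b", "c"), ("c", "d")]
# Star: one hub of degree 3, so the hub is uniquely identifiable by degree.
star = [("hub", "x"), ("hub", "y"), ("hub", "z")]
```

An anonymization algorithm of the kind described would add or remove edges until a check like this passes, while trying to keep the number of modifications small.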

4.
The race for innovation has turned into a race for data. Rapid developments of new technologies, especially in the field of artificial intelligence, are accompanied by new ways of accessing, integrating, and analyzing sensitive personal data. Examples include financial transactions, social network activities, location traces, and medical records. As a consequence, adequate and careful privacy management has become a significant challenge. New data protection regulations, for example in the EU and China, are direct responses to these developments. Data anonymization is an important building block of data protection concepts, as it reduces privacy risks by altering data. The development of anonymization tools involves significant challenges, however. For instance, the effectiveness of different anonymization techniques depends on context, and thus tools need to support a large set of methods to ensure that the usefulness of data is not overly affected by risk-reducing transformations. In spite of these requirements, existing solutions typically support only a small set of methods. In this work, we describe how we have extended an open source data anonymization tool to support almost arbitrary combinations of a wide range of techniques in a scalable manner. We then review the spectrum of methods supported and discuss their compatibility within the novel framework. The results of an extensive experimental comparison show that our approach outperforms related solutions in terms of scalability and output data quality, while supporting a much broader range of techniques. Finally, we discuss practical experiences with ARX and present remaining issues and challenges ahead.

5.
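Research progress on anonymization techniques for privacy protection   Total citations: 1 (self-citations: 1, citations by others: 0)
Anonymization is currently one of the main techniques for privacy protection in data publishing. This paper describes the general concepts and basic principles of anonymization, summarizes the field in terms of anonymization principles, anonymization methods, and anonymization metrics, and finally points out the open research difficulties and future research directions.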

6.
In data publishing, anonymization techniques have been designed to provide privacy protection. Anatomy is an important technique for privacy preservation in data publication and has attracted considerable attention in the literature. However, anatomy is fragile under background knowledge attacks and presence attacks. In addition, anatomy can be applied only to a limited range of applications. To overcome these drawbacks, we propose an improved version of anatomy: permutation anonymization, a new anonymization technique that is more effective than anatomy in privacy protection and at the same time retains significantly more information in the microdata. We present the details of the technique and build its underlying theory. Extensive experiments on real data show that our technique allows highly effective data analysis while offering strong privacy guarantees.

7.
Social networks collect enormous amounts of users' personal and behavioral data, which could threaten users' privacy if published or shared directly. Privacy-preserving graph publishing (PPGP) can make user data available while protecting private information. For this purpose, anonymization methods such as perturbation and generalization are commonly used in PPGP. However, traditional anonymization methods struggle to balance strong privacy with utility, are ineffective against various link and hybrid inference attacks, and are vulnerable to graph neural network (GNN)-based attacks. To solve these problems, we present a novel privacy-disentangled approach that separates private from non-private information for a better privacy-utility trade-off. Moreover, we propose a unified graph deep learning framework for PPGP, denoted privacy-disentangled variational information bottleneck (PDVIB). Using low-dimensional perturbations, the model generates an anonymized graph to defend against various inference attacks, including GNN-based attacks. In particular, the model fits various privacy settings by employing adjustable perturbations at the node level. On three real-world datasets, PDVIB is shown to generate robust anonymous graphs that defend against various privacy inference attacks while maintaining the utility of non-private information.

8.
We present GSUVis, a visualization tool designed to provide a better understanding of location-based social network (LBSN) data. LBSN data is one of the most important sources of information for transportation, marketing, health, and public safety. LBSN data consumers are interested in accessing and analysing data that is as complete and as accurate as possible. However, LBSN data contains sensitive information about individuals, so data anonymization is of critical importance if this data is to be made available to consumers. At the same time, anonymization commonly reduces the utility of the information available. Working with privacy experts, we designed GSUVis, a visual analytics tool, to help experts better understand the effects of anonymization techniques on LBSN data utility. One of GSUVis's primary goals is to make it possible for people to use LBSN data without requiring them to gain deep knowledge of data anonymization. To inform the design of GSUVis, we interviewed privacy experts and collected their tasks and system requirements. Based on this understanding, we designed and implemented GSUVis. It applies two anonymization algorithms for social and location trajectory data to a real-world LBSN dataset and visualizes the data both before and after anonymization. Through feedback from domain experts, we reflect on the effectiveness of GSUVis and the impact of anonymization using visualization.

9.
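Anonymous communication systems were originally created to protect the anonymity of communicating entities' identities and the privacy and integrity of communication content in the network. As these systems have come into wide use, however, their anonymity has grown ever stronger, and with the support of hidden service technology their abuse by criminals has become increasingly severe; darknet platforms built on the hidden services of anonymous communication systems have become a "lawless zone". From the standpoint of network regulators, for anonymous communication systems, especially anonymous communi...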

10.
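To protect private information in social networks, a variety of social network graph anonymization techniques have been proposed. Graph anonymization aims to prevent privacy leakage through graph modification operations while preserving the utility of the anonymized graph for social network analysis and graph queries. Reachability queries are a basic graph query operation, and reachability query accuracy is an important measure of graph data utility. However, current research ignores the effect of graph anonymization on node reachability, which leads to substantial loss of reachability information. To preserve node reachability in the anonymized graph, the reachability preserving anonymization (RPA) algorithm is proposed. Its basic idea is to group nodes and anonymize them with a greedy strategy, thereby reducing the loss of reachability information during anonymization. To make RPA practical, its execution efficiency is optimized: first, reachability intervals are used to efficiently evaluate the anonymization loss caused by edge addition operations; second, a candidate neighbor index further accelerates RPA's anonymization of each node. Experiments on real social network data demonstrate the high execution efficiency of RPA and confirm the high accuracy of the generated anonymized graphs on reachability queries.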

11.
Recently, more and more social network data have been published in one way or another, and preserving privacy when publishing social network data has become an important concern. With some local knowledge about individuals in a social network, an adversary may easily attack the privacy of some victims. Unfortunately, most previous studies on privacy-preserving data publishing can deal with relational data only and cannot be applied to social network data. In this paper, we take an initiative toward preserving privacy in social network data. Specifically, we identify an essential type of privacy attack: neighborhood attacks. If an adversary has some knowledge about the neighbors of a target victim and the relationships among those neighbors, the victim may be re-identified from a social network even if the victim’s identity is protected using conventional anonymization techniques. To protect privacy against neighborhood attacks, we extend the conventional k-anonymity and l-diversity models from relational data to social network data. We show that the problems of computing optimal k-anonymous and l-diverse social networks are NP-hard, and we develop practical solutions to these problems. The empirical study indicates that social network data anonymized by our methods can still be used to answer aggregate network queries with high accuracy.

12.
The use of Public Participation Geographic Information Systems (PPGIS) for data collection has grown significantly over the past few years in different areas of research and practice. With the growing amount of data, there is little doubt that a wider community could benefit from open access to them. Additionally, open data add to the transparency of research and can be considered an essential feature of science. However, data anonymization is a complex task, and the unique characteristics of PPGIS add to this complexity. PPGIS data often include personal spatial and non-spatial information, which require different approaches to anonymization. In this study, we first identify different privacy concerns and then develop a PPGIS data anonymization strategy to address them for open PPGIS data. Specifically, this article introduces a context-sensitive spatial anonymization method to protect individual home locations while maintaining their spatial resolution for mapping purposes. Furthermore, this study empirically evaluates the effects of data anonymization on PPGIS data quality. The results indicate that a satisfactory level of anonymization can be reached using this approach. Moreover, the assessment results indicate that the environmental and home range measurements, as well as their intercorrelations, are not significantly biased by the anonymization. However, analytical measures such as the use of larger spatial units are recommended when anonymized data are used. In this study, European data protection regulations were used as the legal guidelines; however, adaptations of the methods employed here may also be relevant to other countries where comparable regulations exist. Although specifically targeted at PPGIS data, what is discussed in this paper may be applicable to other similar spatial datasets as well.

13.
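A k-anonymity algorithm based on a rounding partition function is proposed, and it is proved theoretically that the algorithm achieves a lower upper bound on non-trivial datasets. In particular, when the dataset is larger than 2k², the upper bound on the size of the anonymity groups in the anonymized data produced by the algorithm is k+1; and when the table to be published is sufficiently large, the average size of all anonymity groups generated by the algorithm approaches k arbitrarily closely. Simulation results show that the algorithm is effective and feasible.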
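To give some intuition for why a rounding-based partition can keep anonymity groups close to size k, here is a minimal sketch. It is not the paper's algorithm: the function `balanced_groups` and the rule of using floor(n / k) groups are assumptions chosen for illustration. The records (assumed to be already sorted by quasi-identifier) are split into floor(n / k) groups whose sizes differ by at most one, so every group has at least k records.

```python
def balanced_groups(records, k):
    """Split records into floor(n / k) contiguous groups of near-equal size.

    Every group has at least k records; for large enough inputs the largest
    group has at most k + 1 records. Illustrative only, not the published method.
    """
    n = len(records)
    if n < k:
        raise ValueError("need at least k records to form one anonymity group")
    num_groups = n // k
    base, extra = divmod(n, num_groups)  # the first `extra` groups get one more record
    groups, start = [], 0
    for i in range(num_groups):
        size = base + (1 if i < extra else 0)
        groups.append(records[start:start + size])
        start += size
    return groups


sizes_large = [len(g) for g in balanced_groups(list(range(10)), 3)]
sizes_small = [len(g) for g in balanced_groups(list(range(5)), 3)]
```

In this sketch, ten records with k = 3 give groups of sizes 4, 3, and 3, while five records with k = 3 can only form a single group of size 5, which shows why a bound of k+1 needs the dataset to be large relative to k.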

14.
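An attribute-order-sensitive privacy protection method for medical data publishing   Total citations: 2 (self-citations: 1, citations by others: 1)
高爱强 (Gao Aiqiang), 刁麓弘 (Diao Luhong). 《软件学报》 (Journal of Software), 2009, 20(Z1): 314-320
Privacy protection has become an important issue in applications involving microdata, such as publishing and sharing medical data or data mining. Anonymity methods based on global recoding or local recoding protect privacy by ensuring that every data record shares the same characteristics with at least a certain number of other records. However, when the processed data are to be used for analysis tasks that are sensitive to attribute order, such methods do not perform well. This paper studies an anonymity method based on data utility metrics, focusing on how the attribute order in data analysis tasks affects anonymity methods. Starting from the concept of multidimensional data anonymization, it discusses a data anonymity method suited to this setting. Experimental results on public datasets show that the method is effective for the problem described and that efficiency is not affected.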
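The guarantee described above, that each record is indistinguishable from at least a certain number of other records on its identifying attributes, is the k-anonymity property. A minimal sketch of a checker follows; the column names, the generalized values, and the function `is_k_anonymous` are hypothetical and serve only to illustrate the definition, not the method of the paper.

```python
from collections import Counter


def is_k_anonymous(rows, quasi_identifiers, k):
    """True if every combination of quasi-identifier values occurs in at least k rows."""
    counts = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return all(count >= k for count in counts.values())


# Hypothetical recoded medical table: age and zip are quasi-identifiers,
# disease is the sensitive attribute.
table = [
    {"age": "30-39", "zip": "100**", "disease": "flu"},
    {"age": "30-39", "zip": "100**", "disease": "cold"},
    {"age": "40-49", "zip": "200**", "disease": "flu"},
]
```

Here the third row is alone in its (age, zip) group, so the table is 1-anonymous but not 2-anonymous; further recoding or suppression of that row would be needed to reach k = 2.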

15.
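张志祥 (Zhang Zhixiang), 金华 (Jin Hua), 朱玉全 (Zhu Yuquan), 陈耿 (Chen Geng). 《计算机工程与设计》 (Computer Engineering and Design), 2011, 32(9): 2938-2942, 3018
k-anonymization of data tables is an important method for protecting data privacy in data publishing, and the (,)-anonymity model proposed on that basis is an effective method for personalized privacy protection. Generalization/suppression is the traditional technique for achieving anonymization, but it suffers from drawbacks such as low efficiency and large information loss. To address these problems, the idea of lossy join is introduced and a greedy (,)-anonymity clustering algorithm is proposed, which protects private data through a lossy join between quasi-identifier attributes and sensitive attributes. Experimental results show that, compared with generalization/suppression, the method has clear advantages in information loss and time efficiency and provides better protection of private information.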

16.

The development of digital media, the increasing use of social networks, and easier access to modern technological devices are affecting thousands of people in their public and private lives. People love posting their personal news without considering the risks involved. Privacy has never been more important. Research on privacy enhancing technologies has attracted considerable international attention after recent news about threats to users’ personal data protection on social media websites like Facebook. It has been demonstrated that even when an anonymous communication system is used, it is possible to reveal users’ identities through intersection attacks or traffic analysis attacks. By combining a traffic analysis attack with social network analysis (SNA) techniques, an adversary may be able to obtain important data about the whole network: its topological structure, subsets of social data, and communities and their interactions. The aim of this work is to demonstrate how intersection attacks can disclose structural properties and significant details of an anonymous social network composed of a university community.
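The core idea of an intersection attack can be sketched in a few lines. This is a simplified illustration under assumed inputs, not the procedure used in the paper: the observer is assumed to see, for each round of an anonymous communication system, only the set of senders and the set of recipients that were active, and the names below are hypothetical.

```python
def intersection_attack(rounds, target):
    """Intersect the recipient sets of all rounds in which the target sent a message.

    rounds: iterable of (senders, recipients) pairs, each a set of identifiers.
    Returns the recipients present in every round where the target was active.
    """
    candidates = None
    for senders, recipients in rounds:
        if target in senders:
            active = set(recipients)
            candidates = active if candidates is None else candidates & active
    return candidates if candidates is not None else set()


observed = [
    ({"alice", "bob"}, {"x", "y", "z"}),
    ({"alice", "carol"}, {"y", "z"}),
    ({"bob"}, {"x"}),
    ({"alice"}, {"y", "w"}),
]
likely_contacts = intersection_attack(observed, "alice")
```

Each additional observation shrinks the candidate set, which is why repeated communication with the same partner weakens the anonymity that a single round would provide. Repeating this for every sender yields sender-recipient links from which a social graph can be reconstructed and analysed.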


17.
In data mining and knowledge discovery, there are two conflicting goals: privacy protection and knowledge preservation. On the one hand, we anonymize data to protect privacy; on the other hand, we allow miners to discover useful knowledge from anonymized data. In this paper, we present an anonymization method that provides both privacy protection and knowledge preservation. Unlike most anonymization methods, in which data are generalized or permuted, our method anonymizes data by randomly breaking links among attribute values in records. Through data randomization, our method maintains statistical relations among data to preserve knowledge, whereas in most anonymization methods knowledge is lost. Thus the data anonymized by our method retain knowledge that is useful for statistical study. Furthermore, we propose an enhanced algorithm for extra privacy protection to handle the situation in which the user’s prior knowledge of the original data may cause privacy leakage. We analyze the privacy levels and the accuracy of knowledge preservation of our method, along with their relations to the method's parameters. Experimental results demonstrate that our method is effective at both privacy protection and knowledge preservation compared with existing methods.

18.
This paper presents a delay-tolerant mix-zone framework for protecting the location privacy of mobile users against continuous query correlation attacks. First, we describe and analyze the continuous query correlation attacks (CQ-attacks) that perform query-correlation-based inference to break the anonymity of road network-aware mix-zones. We formally study the privacy strengths of mix-zone anonymization under the CQ-attack model and argue that spatial cloaking or temporal cloaking over road network mix-zones is ineffective and susceptible to attacks that carry out inference by combining query correlation with timing correlation (CQ-timing attack) and transition correlation (CQ-transition attack) information. Next, we introduce three types of delay-tolerant road network mix-zones (i.e., temporal, spatial, and spatio-temporal) that are free from CQ-timing and CQ-transition attacks and, in contrast to conventional mix-zones, perform a combination of both location mixing and identity mixing of spatially and temporally perturbed user locations to achieve stronger anonymity under the CQ-attack model. We show that by combining temporal and spatial delay-tolerant mix-zones, we can obtain the strongest anonymity for continuous queries while making an acceptable tradeoff between anonymous query processing cost and the temporal delay incurred in anonymous query processing. We evaluate the proposed techniques through extensive experiments conducted on realistic traces produced by GTMobiSim on different scales of geographic maps. Our experiments show that the proposed techniques offer a high level of anonymity and attack resilience for continuous queries.

19.
Mobile crowd sensing (MCS) assumes a collaborative effort from mobile smartphone users to sense and share the data needed to fulfill a given MCS objective (e.g., modeling urban traffic or the wellness index of a community). In this paper, we investigate users’ perception of anonymity in MCS and the factors influencing it. We conducted an extensive 4-week smartphone user study with three main objectives: (1) understand whether users prefer to share data anonymously or not; (2) investigate the possible factors influencing the difference between these two modalities, considering (a) users’ sharing attitude, (b) the kind of data shared, and (c) users’ intimacy when data are shared (we define intimacy as the users’ perception of their context with respect to place and the number and kind of people around them); (3) identify further personal factors influencing users' perception of anonymity via multiple interviews during the user study. In the results, we show that significantly more data are shared when they are collected anonymously. We found that the kind of data shared is the factor that contributes significantly to this difference. Additionally, users have a common way of perceiving anonymity and its effectiveness. To ensure the success of anonymization algorithms in the context of MCS systems, we highlight the issues that researchers developing these algorithms should carefully consider. Finally, based on our findings, we discuss new research paths to better investigate users' perception of anonymity and to develop anonymous MCS systems that users are more likely to trust.

20.
In recent years, online social networks have become a part of everyday life for millions of individuals. Data analysts have also found a fertile field for analyzing user behavior at individual and collective levels, for academic and commercial reasons. On the other hand, there are many risks to user privacy, as information that a user may wish to keep private becomes evident upon analysis. However, when data is anonymized to make it safe for publication in the public domain, information is inevitably lost with respect to the original version, a significant aspect of social networks being the local neighborhood of a user and its associated data. Current anonymization techniques are good at identifying risks and minimizing them, but not so good at maintaining the local contextual data that relate users in a social network. Improving this aspect will therefore have a high impact on the data utility of anonymized social networks. There is also a lack of systems that help a data analyst anonymize this type of data structure and perform empirical experiments in a controlled manner on different datasets. Hence, in the present work we address these issues by designing and implementing a sophisticated synthetic data generator together with an anonymization processor that offers strict privacy guarantees and takes the local neighborhood into account when anonymizing. All this is done for a complex dataset that can be fitted to a real dataset in terms of data profiles and distributions. In the empirical section, we perform experiments to demonstrate the scalability of the method and the reduction in information loss relative to approaches that do not consider the local neighborhood context when anonymizing.
