Similar documents
 20 similar documents found (search time: 0 ms)
1.
Privacy-preserving techniques for publishing data with multiple sensitive attributes*   Cited by: 1 (self-citations: 0, citations by others: 1)
To address privacy disclosure in the publication of data with multiple sensitive attributes, this work builds on multidimensional-bucket grouping and inherits the idea of protecting private data through lossy joins, proposing a (g,l)-grouping method: the sensitive attributes are first grouped by their respective sensitivity levels, and the number of groups is then used as the size of each dimension of the multidimensional bucket. Two linear-time grouping algorithms are also given: a general (g,l)-grouping algorithm (GGLG) and a maximum-sensitivity-first algorithm (MSF). Extensive experiments on real data sets show that the method markedly reduces privacy disclosure and improves the security of data publishing.
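The grouping-then-bucketing idea described in this abstract can be sketched as follows. This is a minimal, hypothetical simplification: the attribute names, sensitivity levels, and records are all invented, and the real (g,l)-grouping algorithms (GGLG, MSF) involve more than this.

```python
# Hypothetical sketch: each sensitive attribute's values are first grouped by
# an (invented) sensitivity level; the group index of each attribute value then
# becomes one coordinate of a multidimensional bucket.

def group_by_sensitivity(values, sensitivity):
    """Partition a sensitive attribute's values into groups by sensitivity level."""
    groups = {}
    for v in values:
        groups.setdefault(sensitivity[v], set()).add(v)
    # Groups ordered by ascending sensitivity level.
    return [sorted(g) for _, g in sorted(groups.items())]

def bucket_records(records, groupings):
    """Place each record into a multidimensional bucket keyed by the
    group number of each of its sensitive attribute values."""
    buckets = {}
    for rec in records:
        key = tuple(
            next(i for i, g in enumerate(groups) if rec[attr] in g)
            for attr, groups in groupings.items()
        )
        buckets.setdefault(key, []).append(rec)
    return buckets

# Toy data with two sensitive attributes (sensitivity levels invented: 1 = low, 2 = high).
disease_sens = {"flu": 1, "cold": 1, "hiv": 2, "cancer": 2}
salary_sens = {"low": 1, "mid": 1, "high": 2}
records = [
    {"disease": "flu", "salary": "high"},
    {"disease": "hiv", "salary": "low"},
    {"disease": "cold", "salary": "mid"},
]
groupings = {
    "disease": group_by_sensitivity(disease_sens.keys(), disease_sens),
    "salary": group_by_sensitivity(salary_sens.keys(), salary_sens),
}
buckets = bucket_records(records, groupings)
```

Each bucket then corresponds to one cell of the multidimensional bucket, over which a lossy-join-style release can be formed.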

2.
A data-publishing method for privacy protection of multidimensional sensitive attributes   Cited by: 2 (self-citations: 0, citations by others: 2)
In anonymized data publishing, when the sensitive attributes are multidimensional, an attacker may obtain one or several dimensions of the sensitive information and, combining it with quasi-identifier information, mount inference attacks on the remaining sensitive attributes. To address this problem, a (Dou-l)-anonymity model is proposed to better protect sensitive information. Based on multidimensional buckets and the idea of decomposition, a (Dou-l)-anonymity algorithm is presented, so that even if an attacker has obtained part of the sensitive data, the privacy of the remaining sensitive attributes is still well protected. Experiments on real data show that the algorithm strikes a good balance between the security and the utility of the published data.

3.
An approximate microaggregation approach for microdata protection   Cited by: 1 (self-citations: 0, citations by others: 1)
Microdata protection is a hot topic in the field of Statistical Disclosure Control, which gained special interest after America Online (AOL) disclosed the search queries of about 658,000 users in August 2006. Many algorithms, methods and properties have been proposed to deal with microdata disclosure. One of the emerging concepts in microdata protection is k-anonymity, introduced by Samarati and Sweeney. k-Anonymity provides a simple and efficient approach to protecting private individual information and is gaining increasing popularity. It requires that every record in the released microdata table be indistinguishably related to no fewer than k respondents. In this paper, we apply the concept of entropy to propose a distance metric that evaluates the amount of mutual information among records in microdata, and propose a method of constructing a dependency tree to find the key attributes, which we then use to perform approximate microaggregation. Further, we adopt this new microaggregation technique to study the k-anonymity problem, and an efficient algorithm is developed. Experimental results show that the proposed microaggregation technique is efficient and effective in terms of running time and information loss.
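The k-anonymity requirement stated in this abstract (every quasi-identifier combination shared by at least k records) can be checked directly. A minimal sketch with invented, already-generalized values:

```python
from collections import Counter

def is_k_anonymous(table, quasi_identifiers, k):
    """A table is k-anonymous if every combination of quasi-identifier
    values appearing in it is shared by at least k records."""
    counts = Counter(tuple(row[a] for a in quasi_identifiers) for row in table)
    return all(c >= k for c in counts.values())

# Toy microdata; the "*" digits mark generalized values (all values invented).
table = [
    {"age": "2*", "zip": "477**", "disease": "flu"},
    {"age": "2*", "zip": "477**", "disease": "hiv"},
    {"age": "3*", "zip": "478**", "disease": "cold"},
    {"age": "3*", "zip": "478**", "disease": "flu"},
]
```

Here the table is 2-anonymous but not 3-anonymous on `{age, zip}`; microaggregation methods like the one in the abstract decide *how* to form such groups with low information loss.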

4.
The collection, processing, and selling of personal data is an integral part of today’s electronic markets, either as a means of operating the business or as an asset in itself. However, the exchange of sensitive information between companies is limited by two major issues. Firstly, regulatory compliance with laws such as SOX requires anonymization of personal data prior to transmission to other parties. Secondly, transmission always implies some loss of control over the data, since further dissemination is possible without the knowledge of the owner. In this paper, we extend an approach based on k-anonymity that aims at solving both concerns in a single step: anonymization and fingerprinting of microdata such as database records. Furthermore, we develop criteria to achieve detectability of colluding attackers, as well as an anonymization strategy that resists combined efforts of colluding attackers to reduce the anonymization level. Based on these results we propose an algorithm for the generation of collusion-resistant fingerprints for microdata.

5.
Data publishing has generated much concern about individual privacy. Recent work has shown that different kinds of background knowledge can bring various threats to the privacy of published data. In this paper, we study the privacy threat from full functional dependencies (FFDs) used as part of the adversary's knowledge. We show that the cross-attribute correlations introduced by FFDs (e.g., Phone → Zipcode) create a potential vulnerability. Unfortunately, none of the existing anonymization principles (e.g., k-anonymity, ℓ-diversity, etc.) can effectively defend against an FFD-based privacy attack. We formalize the FFD-based privacy attack and define a privacy model, (d,ℓ)-inference, to combat it. We distinguish the safe FFDs that do not jeopardize privacy from the unsafe ones. We design robust algorithms that efficiently anonymize the microdata with low information loss when unsafe FFDs are present. The efficiency and effectiveness of our approach are demonstrated by an empirical study.
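The attack sketched in this abstract can be made concrete with a toy example (all phone numbers, zipcodes, and diseases invented): a release that looks diverse on the sensitive attribute still leaks once the adversary uses the dependency Phone → Zipcode to pin down the target's zipcode.

```python
# Toy illustration of an FD-based attack. The release below is 2-diverse on
# "disease" as a whole, but the FD Phone -> Zipcode lets an adversary who
# knows the target's phone number isolate a single record.

def candidates(release, known_zipcode):
    """Records in the anonymized release consistent with the adversary's
    FD-derived knowledge of the target's zipcode."""
    return [r for r in release if r["zip"] == known_zipcode]

release = [  # phone suppressed, zipcode published
    {"zip": "47677", "disease": "hiv"},
    {"zip": "47602", "disease": "flu"},
]
# Adversary's background knowledge realizing Phone -> Zipcode (invented values).
phone_to_zip = {"555-0100": "47677", "555-0199": "47602"}

leaked = candidates(release, phone_to_zip["555-0100"])
```

The candidate set collapses to one record, revealing the target's disease; this is the vulnerability that the (d,ℓ)-inference model is designed to block.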

6.
To address the large information loss on quasi-identifier attributes in existing privacy-protection methods for multidimensional numeric sensitive attributes, and their inability to meet users' personalized demands for ranking the importance of numeric sensitive attributes, a personalized privacy-protection method based on clustering and weighted multidimensional bucket (MSB) grouping is proposed. First, according to the similarity of quasi-identifiers, the data set is partitioned into subsets with similar quasi-identifier attribute values. Then, since users differ in how sensitive they consider each attribute, the sensitivity levels and the bucket capacities of the multidimensional buckets are used to compute a weighted selectivity and to build weighted multidimensional buckets. Finally, the data are grouped and anonymized accordingly. Experiments on 8 attributes of the standard UCI Adult data set compare the method with MNSACM, a data privacy-protection method based on clustering and multidimensional buckets, and WMNSAPM, a personalized privacy-protection method based on clustering and weighted multidimensional bucket grouping. The results show that the proposed method is better overall, clearly outperforming the comparison methods in information loss and running time, thereby improving data quality and runtime efficiency.

7.
The problem of flash data dissemination refers to transmitting time-critical data to a large group of distributed receivers in a timely manner; it arises widely in mission-critical applications and Web services. However, existing approaches to flash data dissemination fail to ensure timely and efficient transmission because of the unpredictability of the dissemination process. Overlay routing has been widely used as an efficient routing primitive that achieves better end-to-end routing quality by detouring around inefficient routing paths in real networks. To improve the predictability of the flash data dissemination process, we propose a bandwidth- and latency-sensitive overlay routing approach named BLOR, which optimizes the overlay routing and avoids inefficient paths in flash data dissemination. BLOR selects routing paths that are optimal with respect to network latency, bandwidth capacity, and available bandwidth jointly, a combination that has not been studied before. Additionally, a location-aware unstructured overlay topology construction algorithm, an unbiased top-k dominance model, and an efficient semi-distributed information management strategy are proposed to assist BLOR's routing optimization. Extensive experiments with real-world data sets verify the effectiveness and efficiency of the proposals. Copyright © 2014 John Wiley & Sons, Ltd.
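The abstract does not spell out BLOR's path-selection rule, but a common building block for bandwidth- and latency-sensitive overlay routing is a modified Dijkstra search that maximizes the path's bottleneck bandwidth and breaks ties on total latency. A sketch under that assumption, with an invented four-node overlay:

```python
import heapq

def widest_lowest_latency_path(graph, src, dst):
    """Modified Dijkstra: maximize the path's bottleneck bandwidth,
    breaking ties by minimizing total latency.
    graph[u][v] = (bandwidth, latency) for the overlay link u -> v."""
    best = {src: (float("inf"), 0.0)}          # node -> (bottleneck bw, latency)
    heap = [(-float("inf"), 0.0, src, [src])]  # priority: (-bw, latency)
    while heap:
        neg_bw, lat, u, path = heapq.heappop(heap)
        bw = -neg_bw
        if u == dst:
            return path, bw, lat
        if (bw, lat) != best[u]:
            continue  # stale queue entry
        for v, (link_bw, link_lat) in graph.get(u, {}).items():
            cand = (min(bw, link_bw), lat + link_lat)
            if v not in best or (-cand[0], cand[1]) < (-best[v][0], best[v][1]):
                best[v] = cand
                heapq.heappush(heap, (-cand[0], cand[1], v, path + [v]))
    return None

# Invented overlay: the direct-looking route via C is low-latency but narrow.
graph = {
    "A": {"B": (100, 5.0), "C": (10, 1.0)},
    "B": {"D": (100, 5.0)},
    "C": {"D": (10, 1.0)},
}
path, bw, lat = widest_lowest_latency_path(graph, "A", "D")
```

For flash dissemination the wide path A→B→D wins despite its higher latency, illustrating why bandwidth must be optimized alongside latency.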

8.
Ship routing problems are a particular kind of routing problem in which the vehicles to be routed are vessels or ships, usually in maritime environments. In contrast to land routing, ship routing has unique features, including overnight trips, disjoint time windows, routes that are not necessarily prespecified, and great uncertainty derived from weather conditions. In this work we present a special ship routing problem that incorporates many features of general ship routing settings. We discuss aspects related to data gathering and updating, which are particularly difficult in the context of ship routing. Additionally, we present a GRASP algorithm to solve this problem. We apply our solution approach to a salmon feed supplier based in southern Chile and present computational results on real data.
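The GRASP metaheuristic named in this abstract alternates a greedy randomized construction (choosing among a restricted candidate list) with local search. A generic sketch on an invented distance matrix, not the paper's actual model, which has time windows and maritime constraints:

```python
import random

def grasp_route(dist, starts=20, rcl_size=3, seed=7):
    """GRASP sketch for a single-vessel open tour starting at port 0:
    repeated greedy randomized construction (restricted candidate list)
    followed by 2-opt local search; returns the best tour found."""
    rng = random.Random(seed)
    n = len(dist)

    def tour_cost(t):
        return sum(dist[t[i]][t[i + 1]] for i in range(len(t) - 1))

    def construct():
        tour, left = [0], set(range(1, n))
        while left:
            # Restricted candidate list: the rcl_size nearest unvisited ports.
            cand = sorted(left, key=lambda j: dist[tour[-1]][j])[:rcl_size]
            nxt = rng.choice(cand)          # randomized greedy choice
            tour.append(nxt)
            left.remove(nxt)
        return tour

    def two_opt(t):
        improved = True
        while improved:
            improved = False
            for i in range(1, len(t) - 1):
                for j in range(i + 1, len(t)):
                    cand = t[:i] + t[i:j + 1][::-1] + t[j + 1:]
                    if tour_cost(cand) < tour_cost(t):
                        t, improved = cand, True
        return t

    best = min((two_opt(construct()) for _ in range(starts)), key=tour_cost)
    return best, tour_cost(best)

# Invented symmetric distances between 4 ports.
dist = [
    [0, 1, 4, 3],
    [1, 0, 2, 5],
    [4, 2, 0, 1],
    [3, 5, 1, 0],
]
tour, cost = grasp_route(dist)
```

Multiple randomized starts let GRASP escape the single greedy tour that a deterministic construction would always produce.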

9.
This work studies the problems that conventional de-identification of multi-tuple relational/set-valued data in data publishing may cause information disclosure and incur high information loss, and proposes PAHI, a de-identification method for multi-tuple relational/set-valued data based on the (K,L)-diversity model. Multi-tuple data are first converted into single-tuple data according to the quasi-identifiers; the partitioning method is then optimized with the information gain ratio to achieve K-anonymity of the set-valued data; sensitivity values are introduced to build set-valued fingerprint buckets, and sensitivity distances are used to optimize the handling of the remaining tuples, ...

10.
Multiple vehicle tracking (MVT) in aerial video sequences can provide useful information for applications such as traffic flow analysis. It is challenging because of the high efficiency requirements and the variable number of vehicles, and particularly so when vehicles are occluded by the shadows of trees, buildings, and large vehicles. In this article, an efficient and flexible MVT approach for aerial video sequences is put forward. First, as a pre-step to the MVT problem, superpixel-segmentation-based multiple vehicle detection (MVD) is achieved by combining two-frame differencing with superpixel segmentation. Two-frame differencing is used to reduce the search space; by scanning the search space via the centres of the superpixels, the moving vehicles are detected efficiently. Then, deterministic data association is proposed to tackle the MVT problem. To improve tracking accuracy, we fuse multiple types of features to establish the cost function. To handle the variable number of vehicles, tracker management is designed that establishes or deletes trackers. Furthermore, for occlusion handling we focus on accurate state estimation, realized by the driver-behaviour-based Kalman filter (DBKF) method, which takes driver behaviour into account, including the speed limit and rear-end collision avoidance with the front vehicle. Together, tracker management and occlusion handling make the MVT approach flexible enough to cope with a variety of traffic scenes. Finally, comprehensive experiments on the DARPA VIVID and KIT AIS data sets demonstrate that the proposed MVT algorithm generates satisfactory and superior results.
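The DBKF details are not given in this abstract; one plausible reading of "taking the speed limit into account" is a constant-velocity Kalman filter whose predicted speed is clamped to the limit. A 1-D sketch under that assumption, with all noise parameters and the speed limit invented:

```python
def kalman_speed_limited(zs, dt=1.0, v_max=30.0, q=1.0, r=4.0):
    """1-D constant-velocity Kalman filter with a driver-behaviour-style
    constraint: the predicted speed is clamped to a speed limit v_max
    (invented value). State x = [position, velocity]; zs are scalar
    position measurements, one per frame."""
    x = [zs[0], 0.0]                      # initial state
    P = [[r, 0.0], [0.0, v_max ** 2]]     # initial covariance (velocity unknown)
    out = []
    for z in zs[1:]:
        # Predict with F = [[1, dt], [0, 1]], clamping speed to the limit.
        x = [x[0] + dt * x[1], max(-v_max, min(v_max, x[1]))]
        P = [[P[0][0] + dt * (P[1][0] + P[0][1]) + dt * dt * P[1][1] + q,
              P[0][1] + dt * P[1][1]],
             [P[1][0] + dt * P[1][1], P[1][1] + q]]
        # Update with H = [1, 0] (position-only measurement).
        S = P[0][0] + r
        K = [P[0][0] / S, P[1][0] / S]
        y = z - x[0]                      # innovation
        x = [x[0] + K[0] * y, x[1] + K[1] * y]
        P = [[(1 - K[0]) * P[0][0], (1 - K[0]) * P[0][1]],
             [P[1][0] - K[1] * P[0][0], P[1][1] - K[1] * P[0][1]]]
        out.append((x[0], x[1]))
    return out

# A vehicle moving at a steady 10 units/frame (synthetic measurements).
estimates = kalman_speed_limited([0.0, 10.0, 20.0, 30.0, 40.0])
```

During occlusion the update step would be skipped and the clamped prediction carried forward, which is what keeps the estimated state physically plausible.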

11.
An efficient algorithm for matching multiple patterns   Cited by: 6 (self-citations: 0, citations by others: 6)
An efficient algorithm for performing multiple pattern match in a string is described. The match algorithm combines the concept of deterministic finite state automata (DFSA) and the Boyer-Moore algorithm to achieve better performance. Experimental results indicate that, in the average case, the algorithm performs pattern match operations sublinearly, i.e. it does not need to inspect every character of the string. The analysis shows that the number of characters inspected decreases as the length of the patterns increases, and increases slightly as the total number of patterns increases. To match a single eight-character pattern in an English string, the algorithm inspects only about 17% of the characters of the string; when the number of patterns is seven, about 33% of the characters are inspected. In an actual test, the algorithm running on a SUN 3/160 took only 3.7 s to search for seven eight-character patterns in a 1.4-Mbyte English text file.
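The paper's exact DFSA + Boyer-Moore combination is not reproduced here, but a related shift-based scheme (Horspool-style bad-character shifts shared across a pattern set, in the spirit of Wu-Manber) shows how sublinear skipping generalizes to multiple patterns:

```python
def multi_search(text, patterns):
    """Shift-based multi-pattern search: slide a window of length
    m = min pattern length; the last window character selects a shift
    that is safe for every pattern (Horspool generalized to a set)."""
    m = min(len(p) for p in patterns)
    # Combined bad-character table: smallest safe shift over all patterns,
    # built from each pattern's first m-1 characters.
    shift = {}
    for p in patterns:
        for j in range(m - 1):
            shift[p[j]] = min(shift.get(p[j], m), m - 1 - j)
    hits = []
    i = 0
    while i + m <= len(text):
        for p in patterns:
            if text.startswith(p, i):
                hits.append((i, p))
        i += shift.get(text[i + m - 1], m)   # characters between probes are skipped
    return hits

found = multi_search("ushers", ["he", "she", "hers"])
```

Unseen characters yield the full shift `m`, which is why average-case inspection drops well below the text length, mirroring the sublinearity reported in the abstract.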

12.
Li Songyuan, Shen Hong, Sang Yingpeng, Tian Hui. The Journal of Supercomputing, 2020, 76(7): 5276-5300
The Journal of Supercomputing - Since Osman Abul et al. first proposed the k-anonymity-based privacy protection for trajectory data, researchers have proposed a variety of trajectory...

13.
Context: Applying a refactoring operation creates a new set of dependencies in the revised design as well as a new set of further refactoring candidates. Studies of stepwise refactoring recommendation approaches have applied one refactoring at a time, which is inefficient because identifying the best candidate in each iteration of the refactoring identification process is computation-intensive. It is therefore desirable to accurately identify multiple independent candidates to enhance the efficiency of the refactoring process. Objective: We propose an automated approach to identify multiple refactorings that can be applied simultaneously to maximize the maintainability improvement of software. Our approach attains the same degree of maintainability enhancement as identifying the single best refactoring per iteration, but in fewer iterations (at lower computation cost). Method: The concept of a maximal independent set (MIS) enables us to identify multiple refactoring operations that can be applied simultaneously. Each MIS contains a group of refactoring candidates that neither affect (i.e., enable or disable) one another nor influence one another's effect on maintainability. A refactoring-effect delta table quantifies the degree of maintainability improvement of each elementary candidate. In each iteration of the refactoring identification process, the multiple refactorings that best improve maintainability are selected from among the sets of refactoring candidates (MISs). Results: We demonstrate the effectiveness and efficiency of the proposed approach by simulating the refactoring operations on several large-scale open-source projects such as jEdit, Columba, and JGit. The results show that our approach improves maintainability by the same degree as, or better than, the competing method of choosing one refactoring candidate at a time, in a significantly smaller number of iterations. Thus, applying multiple refactorings at a time is both effective and efficient. Conclusion: Our proposed approach improves the maintainability as well as the productivity of refactoring identification.
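The batching idea in this abstract can be sketched with a greedy independent-set selection over a conflict graph of refactoring candidates. The candidate names, gains, and conflicts below are invented; the paper's actual MIS construction and delta table are richer than this:

```python
def refactoring_batches(candidates, conflicts, gain):
    """Greedy sketch: repeatedly pick an independent set of refactoring
    candidates (no two in conflict), preferring higher maintainability
    gain, so that each batch can be applied simultaneously."""
    remaining = set(candidates)
    batches = []
    while remaining:
        batch, blocked = [], set()
        for c in sorted(remaining, key=lambda c: -gain[c]):
            if c not in blocked:
                batch.append(c)
                blocked.add(c)
                blocked.update(conflicts.get(c, ()))  # exclude conflicting candidates
        batches.append(batch)
        remaining -= set(batch)
    return batches

# Invented example: r1 and r2 conflict (applying one disables the other),
# as do r2 and r3; r4 is independent of everything.
gain = {"r1": 5, "r2": 4, "r3": 3, "r4": 1}
conflicts = {"r1": {"r2"}, "r2": {"r1", "r3"}, "r3": {"r2"}, "r4": set()}
batches = refactoring_batches(gain.keys(), conflicts, gain)
```

Each batch needs only one round of candidate (re)identification, which is where the iteration savings reported in the abstract come from.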

14.
In location-based services, a density query returns the regions with high concentrations of moving objects (MOs). Density queries can help users identify crowded regions so as to avoid congestion. Most existing methods strive to improve the accuracy of query results but ignore query efficiency; however, response time is also an important concern in query processing and affects the user experience. To address this issue, we present a new definition of continuous density queries. Our approach to processing continuous density queries is based on the new notion of a safe interval, with which the states of both dense and sparse regions are maintained dynamically. Two indexing structures are also used to index candidate regions, accelerating query processing and improving the quality of results. The efficiency and accuracy of our approach are shown through an experimental comparison with snapshot density queries.
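The abstract does not define its safe interval precisely; one natural 1-D reading is that, for objects moving linearly, a region's object count can only change when some object crosses a region boundary, so the time of the next crossing bounds how long a cached dense/sparse label stays valid. A sketch under that assumption, with invented positions and velocities:

```python
def safe_interval(objects, region, threshold, t0=0.0):
    """objects: list of (position, velocity); region: (lo, hi).
    Returns (is_dense_now, t_next): the region's dense/sparse label at
    time t0, and the earliest time > t0 at which some object crosses a
    region boundary. Until t_next the label cannot change."""
    lo, hi = region
    inside = sum(lo <= p + v * t0 <= hi for p, v in objects)
    t_next = float("inf")
    for p, v in objects:
        if v == 0:
            continue                      # stationary objects never cross
        for b in (lo, hi):                # boundary-crossing times
            t = (b - p) / v
            if t > t0:
                t_next = min(t_next, t)
    return inside >= threshold, t_next

# Invented scene: one object approaching, one stationary inside, one leaving.
objects = [(0.0, 1.0), (5.0, 0.0), (20.0, -2.0)]
state = safe_interval(objects, region=(4.0, 10.0), threshold=2)
```

A continuous query can thus sleep until `t_next` instead of re-evaluating every snapshot, which is the efficiency gain the abstract targets.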

15.
In this paper, we examine the issue of learning multiple predicates from given training examples. The proposed MPL-CORE algorithm efficiently induces Horn clauses from examples and background knowledge by employing a single-predicate learning module, CORE. A fast failure mechanism is also proposed, which contributes learning efficiency and learnability to the algorithm. MPL-CORE can employ background knowledge represented in intensional (Horn clauses) or extensional (ground atoms) form during its learning process. With the fast failure mechanism, MPL-CORE outperforms previous multiple predicate learning systems in both computational complexity and learnability.

16.
Periodic broadcast is a cost-effective solution for large-scale distribution of popular videos. Regardless of the number of video requests, this strategy guarantees a constant worst-case service latency to all clients, making it possible to serve a large community with a minimal amount of broadcast bandwidth. Although many efficient periodic broadcast techniques have been proposed, most impose rigid requirements on client receiving bandwidth: they either demand that clients have the same bandwidth as the video server, or limit them to receiving no more than two video streams at any one time. In our previous work, we addressed this problem with a Client-Centric Approach (CCA). This scheme takes into consideration both server broadcast bandwidth and client receiving bandwidth, and allows clients to use all of their receiving capability for prefetching broadcast data. As a result, given a fixed broadcast bandwidth, a shorter broadcast period can be achieved with improved client communication capability. In this paper, we present an enhanced version of CCA that further leverages client bandwidth for more efficient video broadcast. The new scheme reduces the broadcast latency by up to 50% compared to CCA. We prove the correctness of the new technique and provide an analytical evaluation showing its performance advantage over existing techniques.

17.
Color vision deficiency (CVD) affects a high percentage of the population worldwide. When viewing a volume visualization result, persons with CVD may be unable to discriminate the classification information expressed in the image if the color transfer function or the color blending used in direct volume rendering is not appropriate. Conventional methods for this problem adopt advanced image recoloring techniques to enhance the rendering results frame by frame; unfortunately, problematic perceptual results may still be generated. This paper proposes an alternative solution that complements the image recoloring scheme by reconfiguring the components of the direct volume rendering (DVR) pipeline. Our approach optimizes the mapped colors of a transfer function to simulate the CVD-friendly effect that would be generated by applying image recoloring to the results of the initial transfer function. The optimization process has low computational complexity and only needs to be performed once for a given transfer function. To achieve detail-preserving and perceptually natural semi-transparent effects, we introduce a new color composition mode that works in the color space of dichromats. Experimental results and a pilot study demonstrate that our approach yields dichromat-friendly and consistent volume visualization in real time.

18.
Multidimensional scaling (MDS) is a useful mathematical tool that enables the analysis of data in areas where organized concepts and underlying dimensions are not well developed. In this paper, MDS algorithms are used as a dimension-reduction tool that arranges facilities in a two-dimensional space while preserving the adjacency relationships between facilities. The output of MDS is a scatter diagram, which in turn serves as the input, or location reference, for developing the final block layout. Bay-structured layouts are considered, in which the given floor space is first partitioned horizontally or vertically into bays, which are subsequently partitioned into blocks. Rotating the scatter diagram about the origin yields different layouts within the bay structure. A simulated annealing approach is adopted to rotate the scatter diagram so that the total cost of traveling between facilities plus the shape violation in the final layout is minimized. Scope and purpose: This paper discusses the application of multidimensional scaling (MDS) and simulated annealing (SA) to the efficient design of facility layouts. MDS is a powerful mathematical tool widely used in psychometrics as well as in marketing research. By using MDS to generate a reference scatter diagram, a layout can subsequently be developed. The SA algorithm is then applied to rotate the scatter diagram from MDS so that a layout minimizing the total cost of traveling between facilities can be obtained.
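The rotate-then-evaluate loop described in this abstract can be sketched as follows. This is a simplified stand-in, not the paper's method: the MDS coordinates and flows are invented, bays are approximated by snapping x-coordinates to a grid, and the shape-violation term is omitted for brevity.

```python
import math, random

def layout_cost(points, flows, theta, bay_width=1.0):
    """Rotate the MDS scatter diagram by theta, snap x to bay columns,
    and score the flow-weighted rectilinear travel distance."""
    rot = [(x * math.cos(theta) - y * math.sin(theta),
            x * math.sin(theta) + y * math.cos(theta)) for x, y in points]
    snapped = [(round(x / bay_width) * bay_width, y) for x, y in rot]
    return sum(w * (abs(snapped[i][0] - snapped[j][0]) +
                    abs(snapped[i][1] - snapped[j][1]))
               for i, j, w in flows)

def anneal_rotation(points, flows, seed=1, t0=1.0, cooling=0.95, steps=300):
    """Simulated annealing over the rotation angle of the scatter diagram."""
    rng = random.Random(seed)
    theta = 0.0
    best = cur = layout_cost(points, flows, theta)
    best_theta, temp = theta, t0
    for _ in range(steps):
        cand = theta + rng.uniform(-0.3, 0.3)         # perturb the angle
        c = layout_cost(points, flows, cand)
        # Accept improvements always, worsenings with Boltzmann probability.
        if c < cur or rng.random() < math.exp((cur - c) / temp):
            theta, cur = cand, c
            if cur < best:
                best, best_theta = cur, theta
        temp *= cooling
    return best_theta, best

# Invented MDS output for 3 facilities and pairwise flow weights.
points = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]
flows = [(0, 1, 1.0), (0, 2, 1.0)]
best_theta, best_cost = anneal_rotation(points, flows)
```

Rotation leaves pairwise MDS distances unchanged; the cost varies only through the bay-grid snapping, which is why the angle is a meaningful decision variable here.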

19.
20.
To prevent the disclosure of sensitive attributes, the sensitive attributes of published data must be anonymized. Most algorithms proposed so far for the l-diversity model are built on concept hierarchies, which causes unnecessary information loss. To address this, the distance metric of the KACA algorithm, which is based on attribute generalization hierarchies, is combined with clustering, and a clustering-based anonymization algorithm for sensitive attributes is proposed. The algorithm clusters the data set according to the requirements of the l-diversity model. Experimental results show that the algorithm both anonymizes the sensitive attribute values in the data and reduces the degree of information loss.
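A greedy grouping that satisfies the distinct-l-diversity requirement mentioned in this abstract can be sketched as follows (a simplification with an invented sensitive attribute; the paper's algorithm additionally uses the KACA generalization-hierarchy distance to decide which records to cluster together):

```python
def l_diverse_groups(records, sensitive, l):
    """Greedy sketch: build groups each containing at least l distinct
    sensitive values (distinct l-diversity), drawing records from l
    different value 'columns' so every group is diverse by construction."""
    by_value = {}
    for r in records:
        by_value.setdefault(r[sensitive], []).append(r)
    if len(by_value) < l:
        raise ValueError("fewer than l distinct sensitive values")
    columns = list(by_value.values())
    groups = []
    # Keep forming groups while at least l non-empty value columns remain.
    while sum(1 for c in columns if c) >= l:
        columns.sort(key=len, reverse=True)     # take from the fullest columns
        groups.append([columns[i].pop() for i in range(l)])
    leftovers = [r for c in columns for r in c]  # would need suppression/merging
    return groups, leftovers

records = [{"disease": d}
           for d in ["flu", "flu", "flu", "hiv", "hiv", "cold", "cold"]]
groups, leftovers = l_diverse_groups(records, "disease", l=2)
```

Taking one record from each of the l fullest columns guarantees diversity and tends to minimize the number of unplaceable leftovers.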
