Similar literature
18 similar documents found.
1.
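Bloom filter query algorithms   Total citations: 12, self-citations: 0, citations by others: 12
This paper systematically surveys the main research results to date on Bloom filter query algorithms from both theoretical and applied perspectives, analyzes the current state of research, and finally outlines possible future research directions.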

2.
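This paper examines the consistency between the algebraic operations on counting Bloom filters and the corresponding set operations, and studies the performance of set membership queries that use these algebraic operations. Theoretical analysis and experimental results show that the new filters produced by the union, intersection, complement, subtraction, and XOR operations retain the properties of a counting Bloom filter: they support element deletion, produce no false negatives, and can be used for membership queries on the union, intersection, complement, difference, and symmetric difference of sets. When the two original counting Bloom filters are used to query elements of the complement, difference, or symmetric difference, some elements that actually belong to those sets are judged not to belong to them. Querying the filter produced by the algebraic operation avoids this problem, doubles space efficiency, and significantly improves time efficiency. Algebraic operations therefore help extend the range of applications of counting Bloom filters. For example, counting Bloom filter subtraction can serve as a new set reconciliation method for distributing large files in distributed systems.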
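
The abstract does not give the exact definitions of its operations, so the following is only a minimal sketch under stated assumptions: union is taken to be counter-wise addition and subtraction to be counter-wise subtraction clamped at zero. The class name `CountingBloomFilter`, the helper `build`, the salted SHA-256 hashing, and the parameters `m` and `k` are all illustrative choices, not taken from the paper.

```python
import hashlib


class CountingBloomFilter:
    """Counting Bloom filter: m counters, k hash positions per item."""

    def __init__(self, m=1024, k=3, counters=None):
        self.m = m
        self.k = k
        self.counters = list(counters) if counters is not None else [0] * m

    def _positions(self, item):
        # Derive k positions by salting SHA-256 with the hash index
        # (an illustrative choice of hash family).
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item):
        for pos in self._positions(item):
            self.counters[pos] += 1

    def remove(self, item):
        # Only valid for items that were actually added earlier.
        for pos in self._positions(item):
            self.counters[pos] -= 1

    def __contains__(self, item):
        return all(self.counters[pos] > 0 for pos in self._positions(item))

    def union(self, other):
        # Assumed definition: counter-wise addition.
        merged = [a + b for a, b in zip(self.counters, other.counters)]
        return CountingBloomFilter(self.m, self.k, merged)

    def subtract(self, other):
        # Assumed definition: counter-wise subtraction, clamped at zero.
        diff = [max(a - b, 0) for a, b in zip(self.counters, other.counters)]
        return CountingBloomFilter(self.m, self.k, diff)


def build(items, m=1024, k=3):
    """Hypothetical helper: build a filter from an iterable of items."""
    cbf = CountingBloomFilter(m, k)
    for item in items:
        cbf.add(item)
    return cbf


a = build(["x", "y", "z"])
b = build(["y"])
diff = a.subtract(b)  # filter for the difference {"x", "z"}
```

When the second set is a subset of the first and both filters share `m`, `k`, and the hash functions, the subtracted filter has exactly the counters of a filter built directly from the difference, which is why it still supports deletion and yields no false negatives for the difference set.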

3.
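Query algorithm for tiered Bloom filters   Total citations: 8, self-citations: 0, citations by others: 8
A Bloom filter is a data structure that represents a set compactly and supports set membership queries; it is widely used in databases, networks, and distributed systems. Existing Bloom filters do not take the cost of query failures into account. To address this shortcoming, this paper proposes a new cost-sensitive query algorithm for tiered Bloom filters. It partitions elements into subsets according to their query costs, examines the relationship among the minimum query failure rates of the tiers, builds an objective function for the minimum total query failure cost of the set expressed in terms of each tier's minimum false positive probability, and uses a genetic algorithm guided by the gradient of the objective function to obtain the optimal number of hash functions k_i for each tier, completing the mapping from set to vector and the lookup. Simulation results show that, compared with the standard Bloom filter algorithm, the new query algorithm takes essentially the same query computation time and, because it treats set elements differently, its total query failure cost is only 27% of that of the standard algorithm.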
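
For orientation, the first line below is the standard false positive approximation for a Bloom filter with m bits, n inserted elements, and k hash functions, together with the k that minimizes it. The second part is only a hedged sketch of what a cost-weighted objective over tiers could look like; the symbols c_i, n_i, and the shared bit vector are assumptions, and the abstract does not state the paper's actual objective function.

```latex
% Standard Bloom filter: m bits, n inserted elements, k hash functions
p \approx \left(1 - e^{-kn/m}\right)^{k},
\qquad
k_{\mathrm{opt}} = \frac{m}{n}\,\ln 2 .

% Assumed tiered setting: tier i holds n_i elements, uses k_i hash functions,
% and all tiers share one vector of m bits. The fraction of set bits is
\rho \approx 1 - \exp\!\left(-\frac{1}{m}\sum_{j} k_j n_j\right),
\qquad
p_i \approx \rho^{\,k_i} .

% Hypothetical cost-weighted objective (c_i = cost of a failed query in tier i)
C(k_1, \dots, k_t) = \sum_{i=1}^{t} c_i \, p_i .
```

Under these assumptions, raising k_i lowers p_i for an expensive tier but also raises the shared fill ratio, which is the trade-off a per-tier optimization of k_i has to balance.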

4.
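A Bloom filter represents a data set as a bit string and efficiently supports hash-based lookup of elements. It is a compact information representation scheme widely used in databases, networks, and distributed systems. This paper studies sequence analysis methods for Bloom filters: by defining a Bloom filter distance, it uses probabilistic and statistical methods to analyze how additions to and deletions from a dynamic data set affect the Bloom filter, and proposes an algorithm that quantitatively evaluates set changes based on the counting Bloom filter distance. Theoretical analysis and simulation experiments show that the evaluation algorithm achieves an accuracy above 90%.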
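
The bit-string representation and hash lookup described here can be sketched as follows. This is the standard Bloom filter, not the paper's evaluation algorithm; the class name, the salted SHA-256 hashing, and the default `m` and `k` are illustrative assumptions. The abstract does not define its "Bloom filter distance", so `hamming_distance` below is only one plausible stand-in (the number of differing bits), labeled as an assumption.

```python
import hashlib


class BloomFilter:
    """Standard Bloom filter: a bit string of m bits and k hash functions."""

    def __init__(self, m=8192, k=4):
        self.m = m
        self.k = k
        self.bits = bytearray((m + 7) // 8)

    def _positions(self, item):
        # k bit positions derived from salted SHA-256 (illustrative choice).
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def hamming_distance(f1, f2):
    """Assumed distance: number of bit positions where two filters differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(f1.bits, f2.bits))


before = BloomFilter()
after = BloomFilter()
for word in ["alpha", "beta", "gamma"]:
    before.add(word)
    after.add(word)
after.add("delta")  # the set changes by one added element

print("alpha" in after)                 # inserted items are always reported present
print(hamming_distance(before, after))  # at most k bits can differ after one insertion
```

A membership test can return a false positive but never a false negative, and a single insertion flips at most `k` bits, which is what makes a bit-level distance a usable signal of how much the underlying set has changed.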

5.
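In distributed systems such as content distribution networks, gossip protocols, and mobile data synchronization, the accuracy with which the size of the symmetric difference between sets on remote hosts is estimated directly affects the number of message exchange rounds and the reconciliation time of set reconciliation methods based on the CPISync algorithm. The lower the estimation error of the symmetric difference size, the faster the set reconciliation. This paper proposes a quasi-intersection query method based on Bloom filters, which significantly reduces the estimation error of the symmetric difference size and improves the efficiency of the reconciliation algorithm.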

6.
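Zhang En, Liu Yapeng. 《计算机应用》 (Journal of Computer Applications), 2016, 36(10): 2723-2727
In private set comparison (PSI) protocols based on garbled Bloom filters, the participants obtain information unequally and the protocols cannot be applied effectively in cloud environments. To address these problems, this paper combines the garbled Bloom filter algorithm with a proxy oblivious transfer protocol and proposes a cloud-outsourced private set comparison protocol based on garbled Bloom filters and proxy oblivious transfer. First, by introducing the garbled Bloom filter, the algorithm resolves the false positive problem of the traditional standard Bloom filter and can therefore store and transmit large data efficiently. Second, the proxy oblivious transfer protocol allows complex, time-consuming computation to be outsourced to a cloud proxy server, so cloud tenants need not be online in real time and perform only a small amount of computation. Finally, during the cloud-outsourced private set comparison, cloud tenants do not need to interact with each other and obtain the comparison result fairly. Theoretical analysis and performance comparison show that the communication and computational complexity of the algorithm are linear and that the protocol is secure and effective.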

7.
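File-level single-Bloom-filter deduplication can deduplicate data only at file granularity, while block-level single-Bloom-filter deduplication is too time-consuming. To address these shortcomings, this paper uses two Bloom filters to build a two-level deduplication structure at the file level and the data block level. Experimental results show that the dual-Bloom-filter deduplication algorithm can deduplicate data at block granularity and, while maintaining a low false positive rate, takes 43% to 68% less time than the block-level single-Bloom-filter algorithm.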

8.
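This paper proposes TBF, a tree Bloom filter based on a multi-level structure. Multi-level structures have been a focus of recent research on Bloom filters and related data structures. Such a structure enables multi-level storage, reduces the burden on on-chip memory, and speeds up on-chip lookup. TBF is a more efficient algorithm that addresses the shortcomings of the BloomingTree algorithm; it provides the same functionality as CBF with a lower space requirement than CBF. Experiments show that TBF effectively resolves the errors in BloomingTree's logical indexing and is also more time-efficient than BloomingTree: with the same number of levels and the same false positive rate, query time improves by 13.4% on average, insertion time by 17.9% on average, and deletion time by 12% on average.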

9.
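As networks develop, more and more scenarios require approximate membership queries over incomplete data, which the Bloom filter used for traditional membership queries cannot satisfy. This paper proposes a Bloom approximate query algorithm for missing data: it first pre-fills the missing parts of high-dimensional incomplete data, uses the PCA algorithm to transform the high-dimensional data into low-dimensional data, and stores the low-dimensional data in a Bloom filter using a combination of locality-sensitive hash functions and standard hash functions. Experiments on two real data sets verify that the proposed algorithm effectively solves the approximate membership query problem in the presence of missing data.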

10.
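In distributed systems, coverage queries play an important role in maintaining file integrity and data consistency. Although Bloom filters support fast element membership queries, they can store and represent only discrete data sets. This paper therefore represents range rules as sets of prefixes and proposes a prefix-encoding conversion function that converts each prefix code into a unique corresponding binary string. To support coverage queries, it combines a counting Bloom filter with a set of linked lists and designs a system, BFrange, that stores pairs consisting of a rule identifier and the stored element itself. With BFrange, the time of a coverage query is independent of the number of stored rules, with a complexity of only O(1). Simulation results verify that BFrange achieves efficient and accurate coverage queries.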
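
Representing a range rule as a set of prefixes can be illustrated with the usual greedy range-to-prefix conversion. This is a sketch of that general technique only: the function name `range_to_prefixes`, the `width` parameter, and the `*` wildcard notation are assumptions, and the paper's own prefix-encoding conversion function and the BFrange structure are not reproduced here.

```python
def range_to_prefixes(lo, hi, width):
    """Cover the integer range [lo, hi] with a small list of binary prefixes.

    Each prefix is a string of `width` characters: fixed leading bits
    followed by '*' wildcards for the remaining bits.
    """
    if not 0 <= lo <= hi < (1 << width):
        raise ValueError("range must fit in `width` bits")
    prefixes = []
    while lo <= hi:
        # Largest power-of-two block that can start at lo (alignment).
        size = lo & -lo if lo else 1 << width
        # Shrink the block until it no longer extends past hi.
        while size > hi - lo + 1:
            size >>= 1
        wild = size.bit_length() - 1   # number of wildcard bits
        fixed = width - wild           # number of fixed leading bits
        head = format(lo >> wild, "b").zfill(fixed) if fixed else ""
        prefixes.append(head + "*" * wild)
        lo += size
    return prefixes


print(range_to_prefixes(1, 6, 3))  # ['001', '01*', '10*', '110']
```

Each resulting prefix is a discrete item, so it can be inserted into a Bloom filter like any other key, which is what makes prefix sets a natural bridge between range rules and a structure that only stores discrete elements.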

11.
Synchronization between two sets is an important requirement for many distributed applications. A basic prerequisite is to find out which elements of set A are not in set B and vice versa. A very space-efficient data structure for such membership queries, widely used in networking applications, is the Bloom filter. Unfortunately, the Bloom filter owes its high efficiency to the fact that queries may return false positives. This precludes the adoption of Bloom filters in applications that cannot tolerate such errors. In this paper we present an approach that augments Bloom filters with a trie-based mechanism that deterministically and efficiently finds the false positives after using the Bloom filter to synchronize two sets. We show that the added communication overhead for our approach is negligible compared to the overhead of a plain Bloom filter.

12.
Finding similar items in a large and unstructured dataset is a challenging task in many applications of data science, such as searching, indexing, and retrieval. With increasing data volumes and demand for real-time responses, similarity search has received considerable attention. This paper proposes a parallel computational approach for similarity search using Bloom filters (PCASSB), which uses Bloom filters to represent the features of documents and to compare them with the user's query. Query features are stored in an integer query array (IQA), an array of integers. PCASSB, an approximate similarity search technique, has been implemented on a graphics processing unit with Compute Unified Device Architecture (CUDA) as the programming platform. To compute the similarity score between the query and the reference dataset, the Dice coefficient is used as a baseline method. The accuracy of the results generated by PCASSB is compared with the baseline method and other state‐of‐the‐art methods. The experimental results show that the proposed technique is effective in processing a large number of text documents because it takes less computational time.
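
The Dice coefficient used as the baseline is 2|A ∩ B| / (|A| + |B|) for two feature sets A and B. The sketch below applies it to whitespace-separated tokens; the tokenization and the function name `dice_coefficient` are assumptions for illustration, and the abstract does not say which features the authors extract.

```python
def dice_coefficient(a, b):
    """Dice similarity of two feature collections, in the range [0, 1]."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention chosen here: two empty sets are identical
    return 2 * len(a & b) / (len(a) + len(b))


query = "bloom filter similarity search".split()
document = "parallel similarity search with bloom filter".split()

# 4 shared tokens, set sizes 4 and 6: 2 * 4 / (4 + 6)
print(dice_coefficient(query, document))  # 0.8
```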

13.
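To meet the need for multi-keyword retrieval over encrypted files and to solve the problem that frequent file updates in the cloud storage model force users to rebuild the secure index, this paper proposes an incremental multi-keyword searchable encryption algorithm. Based on a new multi-element random Bloom filter, the multiple query keywords submitted by a user are randomly hashed and mixed, and are then tested quickly against the random Bloom filter. While guaranteeing the confidentiality of data and queries, the algorithm protects the privacy of user access patterns and supports independent file updates, effectively avoiding a full rebuild of the searchable structure. Its efficiency and retrieval accuracy are considerably higher than those of comparable algorithms. Experiments verify the practicality of the scheme.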

14.
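With advances in pervasive computing, positioning, and mobile communication technologies, moving object data management is widely used in many fields. Privacy protection is an issue that cannot be ignored in this area: users expect not only high-quality service but also the best possible protection of their private information. This paper studies spatial membership queries, which test whether a moving object exists within a given spatial region. A key feature of the proposed BFSQ (Bloom filter-based spatial query) method is that it protects the privacy of mobile data and user queries well while keeping the quality of query results at a high level. Experimental results demonstrate the efficiency and effectiveness of the new method.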

15.
Pervasive applications, such as natural habitat monitoring and location-based services, have attracted considerable research interest. These applications, which deploy many sensor devices to collect data from external environments, often have limited network bandwidth and battery resources. The sensors also cannot record accurate values. The uncertainty of data captured by a sensor should thus be considered in query evaluation. To this end, probabilistic queries, which account for data imprecision and provide statistical guarantees on answers, have recently been studied.

16.
Privacy-preserving association rule mining has recently become an active research area. There are two different approaches to this problem: perturbation-based and secure multiparty computation-based. One drawback of the perturbation-based approach is that it cannot always fully preserve individuals' privacy while achieving precise mining results. The secure multiparty computation-based approach works only in distributed environments and needs sophisticated protocols, which constrains its practical usage. In this paper, we propose a new approach for preserving privacy in association rule mining. The main idea is to use keyed Bloom filters to represent transactions as well as data items. The proposed approach can fully preserve privacy while maintaining the precision of mining results. The tradeoff between mining precision and storage requirement is investigated. We also propose a δ-folding technique to further reduce the storage requirement without sacrificing mining precision and running time.

17.
Dimitris  Nikos  Costas. Computers & Security, 2009, 28(7): 578-591
Any application or service utilizing the Internet is exposed to both general Internet attacks and other specific ones. Most of the time the latter exploit a vulnerability or misconfiguration in the provided service or in the protocol it uses. Consequently, critical services deployed over the Internet, such as Voice over IP (VoIP) services, are vulnerable to such attacks and, on top of that, offer a field for new attacks or variations of existing ones. Among the various threats and attacks that a service provider should consider are flooding attacks at the signaling level, which are very similar to those against TCP servers but have emerged at the application level of the Internet architecture. This paper examines flooding attacks against VoIP architectures that employ the Session Initiation Protocol (SIP) as their signaling protocol. The focus is on the design and implementation of an appropriate detection method. Specifically, a Bloom filter-based monitor is presented and a new metric, named session distance, is introduced in order to provide an effective protection scheme against flooding attacks. The proposed scheme is evaluated on an experimental test bed architecture under different scenarios. The results of the evaluation demonstrate that the time required to detect such an attack is negligible and that the number of false alarms is close to zero.

18.
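Focusing on fuzzy matching of multiple keywords and on guaranteeing data security, this paper studies multi-keyword fuzzy search methods and proposes a multi-keyword fuzzy ciphertext search scheme. Based on the Bloom filter, the scheme uses a dual coding function and locality-sensitive hash functions to build the file index and encrypts the index with a distance-recoverable encryption algorithm, achieving fuzzy ciphertext search over multiple keywords. The scheme does not need to set the index storage space in advance, which greatly reduces search complexity. In addition, unlike existing schemes, it requires no predefined dictionary, which reduces storage overhead. Experimental and security analyses show that the scheme not only achieves multi-keyword fuzzy ciphertext search but also guarantees confidentiality and privacy.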
