19 similar documents found (search time: 78 ms)
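To improve the real-time performance and availability of distributed top-k monitoring over data streams in big-data settings, this paper studies how a distributed system that monitors multiple data streams processes its data, and proposes a distributed top-k monitoring algorithm with a low memory footprint. The algorithm uses a bounded amount of memory to rearrange the key data that is otherwise scattered across the nodes, classifies the situations that can arise during processing, and assigns a processing procedure to each case based on the rearrangement and the classification. As a result, a large share of data updates can be completed with no network communication, or with only a small amount of it, which reduces the network traffic of monitoring and improves system availability while preserving real-time monitoring. Experimental results show that the algorithm is effective and feasible.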

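The main challenge in processing high-speed data in a distributed environment is producing high-quality query results with few network resources. This paper studies nearest-neighbor queries in distributed environments and proposes a new filter-based method that can compute exact query results and also handle five classes of approximate queries. The method installs an intelligent filter at each remote site and reduces the volume of transmitted data by setting the filter ranges appropriately. Theoretical analysis and experiments on both synthetic and real data sets show that the new method performs well.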

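An accurate method for mining top-k frequent patterns under differential privacy   Total citations: 1 (self-citations: 0, citations by others: 1)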
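Frequent pattern mining is a common technique for analyzing transaction data sets. However, when a transaction data set contains sensitive data (such as user behavior records or electronic medical records), directly publishing frequent patterns and their support counts poses a considerable risk to individual privacy. To address this, the paper proposes DP-topkP (differentially private top-k pattern mining), a top-k frequent pattern mining algorithm that satisfies ε-differential privacy. The algorithm uses the exponential mechanism to select, from the candidate frequent pattern set, the top-k patterns carrying their true support counts; perturbs the true support counts of the selected patterns with noise generated by the Laplace mechanism; and, to improve the utility of the output patterns, applies post-processing to refine the noisy support counts of the top-k patterns. The algorithm is proved to satisfy ε-differential privacy and to meet the (λ,δ)-useful requirement. Experimental results show that DP-topkP achieves good accuracy, utility, and scalability.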
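The Laplace perturbation step mentioned in this abstract can be illustrated with a minimal sketch. It is not the DP-topkP implementation: the function names, the default sensitivity of 1, and the way the privacy budget `epsilon` is applied to each count are assumptions made for illustration only, and the exponential-mechanism selection and post-processing steps are omitted.

```python
import random

def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponential variables with mean `scale`
    # follows a Laplace distribution with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def perturb_counts(counts, epsilon, sensitivity=1.0, seed=None):
    """Add Laplace(sensitivity / epsilon) noise to each support count.

    `counts` maps a pattern to its true support count. The sensitivity
    value and the per-count use of `epsilon` are illustrative assumptions.
    """
    rng = random.Random(seed)
    scale = sensitivity / epsilon
    return {pattern: count + laplace_noise(scale, rng)
            for pattern, count in counts.items()}

if __name__ == "__main__":
    true_counts = {"ab": 120, "bc": 95, "abc": 60}
    print(perturb_counts(true_counts, epsilon=0.5, seed=42))
```

A smaller `epsilon` gives a larger noise scale, so stronger privacy comes at the cost of less accurate published counts.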

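In distributed data streams, correlation analysis between streams can reveal intrinsic relationships among the monitored objects. This paper proposes a method for computing correlation coefficients based on basic windows. The method first rewrites the correlation coefficient formula so that it is composed of factors suitable for aggregation over basic windows, and then aggregates each factor with the basic-window approach. Basic-window aggregation partitions the data items in a window into a series of basic windows and computes over each basic window separately. When the window slides by an arbitrary amount, the aggregation over the new window can partially reuse the results of the previous window's aggregation. Simulation experiments show that, compared with aggregating all the data in the window every time, the basic-window method effectively reduces the time needed to compute correlation coefficients over data streams.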
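The idea of rewriting the correlation coefficient into factors that aggregate over basic windows can be sketched as follows. This is an illustrative reconstruction, not the paper's code: it assumes the Pearson coefficient, assumes the window slides by whole basic windows, and the class and method names are hypothetical. Each basic window is reduced to six sums (n, Σx, Σy, Σx², Σy², Σxy), so a slide only adds the newest summary and subtracts the oldest.

```python
from collections import deque
from math import sqrt

class BasicWindowCorrelation:
    """Pearson correlation over a sliding window built from basic windows."""

    def __init__(self, num_basic_windows):
        self.num_basic_windows = num_basic_windows
        self.summaries = deque()      # one 6-tuple of sums per basic window
        self.total = [0.0] * 6        # n, sum x, sum y, sum x^2, sum y^2, sum xy

    def push_basic_window(self, xs, ys):
        if len(xs) != len(ys):
            raise ValueError("xs and ys must have the same length")
        summary = (float(len(xs)), float(sum(xs)), float(sum(ys)),
                   float(sum(x * x for x in xs)),
                   float(sum(y * y for y in ys)),
                   float(sum(x * y for x, y in zip(xs, ys))))
        self.summaries.append(summary)
        self.total = [t + v for t, v in zip(self.total, summary)]
        # Evict the oldest basic window once the sliding window is full.
        if len(self.summaries) > self.num_basic_windows:
            old = self.summaries.popleft()
            self.total = [t - v for t, v in zip(self.total, old)]

    def correlation(self):
        n, sx, sy, sxx, syy, sxy = self.total
        cov = n * sxy - sx * sy
        var_x = n * sxx - sx * sx
        var_y = n * syy - sy * sy
        if n == 0 or var_x <= 0 or var_y <= 0:
            return None               # undefined for empty or constant data
        return cov / sqrt(var_x * var_y)
```

With this layout the cost of a slide depends on the size of one basic window rather than on the size of the whole sliding window.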

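With the continuing development of communication technology and hardware, and especially the widespread use of small wireless sensing devices, data collection and generation have become increasingly convenient and automated, and researchers now face the problem of managing and analyzing large, dynamic data sets. Applications that produce data streams are already very common, including sensor networks, financial securities management, network monitoring, Web logs, and online analysis of communication data. These applications share several characteristics: the environment is equipped with multiple distributed computing nodes; the nodes are usually close to the data sources; and analyzing and monitoring data in such an environment usually requires some knowledge of the mining task, the data distribution, the data arrival rate, and the mining method. This paper surveys current progress in distributed data stream mining and outlines possible directions for future research.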

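Large-scale distributed monitoring systems face a conflict between the scale of the data they manage and their resource constraints; prediction models can effectively reduce the network communication overhead. After defining the problem domain and analyzing related work, this paper proposes two improved prediction models and gives the corresponding adjustment strategies for when a prediction fails. The improved models are evaluated on data streams built from synthetic data and from real sea-surface air temperature data measured by TAO (Tropical Atmosphere Ocean). Theoretical analysis and experimental results both show that the improved models achieve a higher prediction hit rate and lower network communication overhead.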
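The general principle behind prediction-based monitoring can be shown with a small sketch. The abstract does not describe the two improved models, so the predictor below (the last transmitted value) is only a stand-in assumption, and the function name and `tolerance` parameter are hypothetical. The node and the coordinator share the same prediction, and the node sends a message only when the prediction misses by more than the tolerance.

```python
def suppress_updates(readings, tolerance):
    """Simulate one monitored node that reports only on prediction failure.

    Returns (sent, view): `sent` lists the (time, value) messages that were
    transmitted, and `view` is the value the coordinator assumes at each step.
    The predictor is simply the last transmitted value (an assumption).
    """
    sent = []
    view = []
    predicted = None
    for t, value in enumerate(readings):
        if predicted is None or abs(value - predicted) > tolerance:
            # Prediction failed: transmit the reading and reset the model.
            predicted = value
            sent.append((t, value))
        view.append(predicted)
    return sent, view

if __name__ == "__main__":
    temps = [20.0, 20.2, 20.4, 21.5, 21.6]
    messages, coordinator_view = suppress_updates(temps, tolerance=0.5)
    print(len(messages), "of", len(temps), "readings transmitted")
```

The coordinator's view never deviates from the true reading by more than `tolerance`, which is the accuracy given up in exchange for fewer messages.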

Processing distributed compound data streams   Total citations: 3 (self-citations: 1, citations by others: 3)
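In a distributed data stream environment, the system's communication bandwidth is a bottleneck resource. To effectively reduce the volume of data streams transmitted over the network while guaranteeing query accuracy, this paper proposes a new way of transmitting data streams, called compound data streams. The compound data stream method groups and merges the original data streams in a distributed data stream system into compound data streams before transmission. After defining compound data streams, the paper gives an algorithm for generating them, analyzes the error of query operations based on compound data streams, discusses issues related to constructing compound data streams, and finally verifies the effectiveness of the method through experiments.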

A role-based access control model for distributed environments   Total citations: 1 (self-citations: 0, citations by others: 1)
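To address the limitations of access control models in distributed systems, this paper proposes a role-based access control model for distributed systems. The model extends traditional RBAC in two ways: it divides roles into functional roles and task roles, and it adds an attribute to each task role indicating whether the subject granted that role belongs to the local domain or to a foreign domain, which avoids the oversimplification of assigning roles directly through peer roles. This facilitates enforcing least privilege and allows different policies to be applied to access requests from local-domain and foreign-domain subjects, extending role-based control from centralized to distributed settings to meet the needs of evolving distributed systems.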

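Most current credential collection techniques gather credentials through traditional trust negotiation, which places a heavy load on the trust server and suffers from blind credential search. This paper presents DPN, a distributed proof negotiation algorithm for trust. Based on the RTP policy language, DPN can intelligently choose between remote proof and local deduction of trust relationships, improving the efficiency of trust establishment. DPN can output a record of the trust rules used during proof negotiation, supporting verification of the trust establishment process. The correctness and completeness of the algorithm are analyzed, and experiments demonstrate the performance gain it brings.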

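This paper analyzes how node resources are distributed in P2P networks. Building on resource matching degree, it introduces the concept of node matching degree with respect to the search conditions. Based on node matching degree and the small-world distribution of resources, it proposes an algorithm for searching and evaluating top-k resources. The algorithm lets a search run across the whole network while propagating toward regions with high resource matching. It improves search efficiency and saves network bandwidth while guaranteeing that the k resources finally obtained are the best matches. Selecting the matching nodes to broadcast to according to the search conditions effectively balances the bandwidth and computing resources spent on search and evaluation.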

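This paper proposes a new algorithm for the top-k problem in distributed environments (finding the k items with the largest global values). Earlier algorithms such as TA, TPUT, and HT consume a large amount of bandwidth. The KLEE algorithm greatly reduces bandwidth consumption but cannot return an exact answer. The proposed algorithm, FT, adds a preprocessing phase and uses the histogram bloom technique, so it both reduces bandwidth consumption effectively and returns an exact answer. FT and the related algorithms were implemented and compared comprehensively on real data sets and on data sets synthesized for different scenarios. Experimental results show that FT offers a substantial improvement in bandwidth consumption over the other algorithms.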

Network overlays support the execution of distributed applications, hiding lower-level protocols and the physical topology. This work presents DiVHA, a distributed virtual hypercube algorithm that allows the construction and maintenance of a self‐healing overlay network based on a virtual hypercube. DiVHA keeps logarithmic properties even when the number of nodes is not a power of two, presenting a scalable alternative for connecting distributed resources. DiVHA assumes a dynamic fault situation, in which nodes fail and recover continuously, leaving and joining the system. The algorithm is formally specified, and the latency for detecting changes and the subsequent reconstruction of the topology is proved to be bounded. An actual overlay network based on DiVHA, called HyperBone, was implemented and deployed on PlanetLab. HyperBone offers services such as monitoring and routing, allowing the execution of Grid applications across the Internet. HyperBone also includes a procedure for detecting groups of stable nodes, which allowed the execution of parallel applications on a virtual hypercube built on top of PlanetLab. Copyright © 2014 John Wiley & Sons, Ltd.

Due to the recent massive generation of data, preference queries are becoming increasingly important for users because such queries retrieve only a small number of preferable data objects from a huge multi-dimensional dataset. A top-k dominating query, which retrieves the k data objects dominating the highest number of data objects in a given dataset, is particularly important in supporting multi-criteria decision making because this query can find interesting data objects in an intuitive way, exploiting the advantages of top-k and skyline queries. Although efficient algorithms for top-k dominating queries have been studied over centralized databases, there are no studies that deal with top-k dominating queries in distributed environments. Data management is becoming increasingly distributed, so it is necessary to support the processing of top-k dominating queries in distributed environments. In this paper, we address, for the first time, the challenging problem of processing top-k dominating queries in distributed networks and propose a method for efficient top-k dominating data retrieval, which avoids redundant communication cost and latency. Furthermore, we also propose an approximate version of our method, which further reduces communication cost. Extensive experiments on both synthetic and real data have demonstrated the efficiency and effectiveness of our proposed methods.

Weixiong Rao, Lei Chen. World Wide Web, 2011, 14(5-6): 545-572
Recent years have witnessed the explosive growth of "live" web content on the World Wide Web, such as Weblogs, RSS feeds, and real-time news. The popular usage of RSS feeds/readers enables end users to subscribe to favorite contents via input RSS URLs. However, the RSS feeds/readers architecture suffers from (i) high bandwidth consumption and (ii) limited filtering semantics. In this paper, we propose a stateful full-text dissemination scheme over structured P2Ps to address both issues. Specifically, on the semantic side, end users are allowed to subscribe to favorite contents via input keywords; on the network bandwidth side, cooperative content polling, filtering, and dissemination via DHT-based P2P overlay networks reduce network bandwidth consumption. Our contributions include novel techniques to (i) reduce the unit-publishing cost by pruning irrelevant documents along the forwarding path towards destinations, and (ii) reduce the publication amount by selecting a very small number of meaningful terms. Based on real data sets, our experimental results show that the proposed scheme can significantly reduce the publishing cost with low maintenance overhead and high document quality.

An efficient distributed algorithm for constructing small dominating sets   Total citations: 1 (self-citations: 0, citations by others: 1)
The dominating set problem asks for a small subset D of nodes in a graph such that every node is either in D or adjacent to a node in D. This problem arises in a number of distributed network applications, where it is important to locate a small number of centers in the network such that every node is near at least one center. Finding a dominating set of minimum size is NP-complete, and the best known approximation is logarithmic in the maximum degree of the graph and is provided by the same simple greedy approach that gives the well-known logarithmic approximation result for the closely related set cover problem. We describe and analyze new randomized distributed algorithms for the dominating set problem that run in polylogarithmic time, independent of the diameter of the network, and that return a dominating set of size within a logarithmic factor from optimal, with high probability. In particular, our best algorithm runs in O(log n log Δ) rounds with high probability, where n is the number of nodes, Δ is one plus the maximum degree of any node, and each round involves a constant number of message exchanges among any two neighbors; the size of the dominating set obtained is within O(log Δ) of the optimal in expectation and within O(log n) of the optimal with high probability. We also describe generalizations to the weighted case and the case of multiple covering requirements. Received: January 2002 / Accepted: August 2002. Supported by NSF CAREER award NSF CCR-9983901.
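The "simple greedy approach" that this abstract uses as its baseline can be sketched as follows. This is the sequential, centralized greedy heuristic, not the randomized distributed algorithm the paper proposes; the function name and the adjacency-dictionary input format are assumptions made for illustration. At each step it picks the node whose closed neighborhood covers the most nodes that are still uncovered.

```python
def greedy_dominating_set(adj):
    """Return a dominating set for an undirected graph.

    `adj` maps every node to an iterable of its neighbors (each node must
    appear as a key). Ties are broken by the sorted order of the nodes.
    """
    uncovered = set(adj)
    dominating = []
    nodes = sorted(adj)
    while uncovered:
        # Span of v = number of uncovered nodes in v's closed neighborhood.
        best = max(nodes,
                   key=lambda v: len(({v} | set(adj[v])) & uncovered))
        dominating.append(best)
        uncovered -= {best} | set(adj[best])
    return dominating

if __name__ == "__main__":
    path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
    print(greedy_dominating_set(path))
```

The loop always terminates because any uncovered node covers at least itself, so every iteration shrinks the uncovered set.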

Mining association rules plays an important role in data mining and knowledge discovery since it can reveal strong associations between items in databases. Nevertheless, an important problem with traditional association rule mining methods is that they can generate a huge number of association rules depending on how parameters are set. However, users are often only interested in finding the strongest rules, and do not want to go through a large number of rules or wait for these rules to be generated. To address those needs, algorithms have been proposed to mine the top-k association rules in databases, where users can directly set a parameter k to obtain the k most frequent rules. However, a major issue with these techniques is that they remain very costly in terms of execution time and memory. To address this issue, this paper presents a novel algorithm named ETARM (Efficient Top-k Association Rule Miner) to efficiently find the complete set of top-k association rules. The proposed algorithm integrates two novel candidate pruning properties to more effectively reduce the search space. These properties are applied during the candidate selection process to identify items that should not be used to expand a rule based on its confidence, to reduce the number of candidates. An extensive experimental evaluation on six standard benchmark datasets shows that the proposed approach outperforms the state-of-the-art TopKRules algorithm in terms of both runtime and memory usage.

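Building on the halving cyclic coding algorithm, this paper proposes an algorithm for generating symmetric distributed mutual-exclusion request sets that increases the number of nodes used to initialize the algorithm and uses relaxed forward difference sets. The time complexity of the algorithm is greatly reduced, while the length of the generated request sets still stays between (2N)^(1/2) and 2N^(1/2).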

A dual-subswarm PSO algorithm for dynamic environments   Total citations: 3 (self-citations: 1, citations by others: 3)
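A new dual-subswarm particle swarm optimization algorithm is constructed from a main subswarm and an auxiliary subswarm that search in opposite directions and cooperate with each other. The algorithm widens the search range of the population, makes full use of the useful information in the search domain, and can quickly and accurately track a dynamically changing optimum once it detects a change in the environment. The algorithm is validated on complex dynamic environments generated by DF1 (Dynamic Function 1) and compared with the particle swarm optimization algorithm for dynamic environments proposed by Eberhart. Simulation results demonstrate the effectiveness of the algorithm.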
