Similar literature
 20 similar documents found (search time: 46 ms)
1.
We consider the problem of efficiently computing distributed geographical k-NN queries in an unstructured peer-to-peer (P2P) system, in which each peer is managed by an individual organization and can communicate only with its logical neighboring peers. Such queries are based on local filter query statistics and must incur as little communication cost as possible, which makes them more difficult than existing distributed k-NN queries. In particular, we aim to reduce the number of candidate peers and lower the communication cost. In this paper, we propose an efficient pruning technique that minimizes the number of candidate peers to be processed to answer k-NN queries. Our approach is especially suitable for continuous k-NN queries under peer updates, including changes to the ranges of peers, peers dynamically leaving or joining, and data updates within a peer. In addition, simulation results show that the proposed approach outperforms existing Minimum Bounding Rectangle (MBR)-based query approaches, especially for continuous queries.

2.
Together with advanced positioning and mobile technologies, P2P query processing has attracted growing interest for a number of location-aware applications, such as answering kNN queries in mobile ad hoc networks. It not only overcomes drawbacks of centralized systems, such as single points of failure and bottlenecks, but more importantly harnesses the power of peers' collaboration. In this research, we propose a pure mobile P2P query processing scheme that focuses primarily on the search and validation algorithm for kNN queries. The proposed scheme is designed for pure mobile P2P environments without base station support. Simulation results show that, compared with centralized and hybrid systems, our system can reduce energy consumption by a factor of more than six through data sharing among peers, with a reasonable mean processing latency, in networks with a high density of moving objects.

3.
Peer-to-Peer (P2P) systems have attracted much attention in the academic community and industry circles due to their promising applications in various domains. This paper presents the authors' research efforts on introducing complex query capabilities in a P2P environment consisting of numerous peers with large volumes of data. An underlying hybrid P2P computing platform, named BestPeer, is described first. The connection among peers within BestPeer is self-configurable through maintaining the nearest neighbor of peers, and the agent techniques employed in the system ensure its capability of providing sophisticated services. The designs of three P2P data management systems, all based on BestPeer, are described in detail. They provide support for information retrieval, query processing and Web services, respectively. Advantages and limitations are discussed, and ongoing work is presented. The current systems can provide basic functions for keyword-based search, SQL-like query processing, and Web services querying and discovery. Some further topics on providing fully fledged data management functionalities with security guarantees for P2P distributed computing systems are also discussed.

4.
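Efficient Top-k Queries in Pure Peer-to-Peer Environments   Total citations: 19, self-citations: 2, citations by others: 19
He Yingjie, Wang Shan, Du Xiaoyong. Journal of Software (软件学报), 2005, 16(4): 540-552
Most current Peer-to-Peer (P2P) systems support only search based on file identifiers; users cannot search by file content. Top-k queries are widely used in search engines and have been highly successful. However, because a P2P system is dynamic and decentralized, performing top-k queries in a pure P2P environment is challenging. This paper proposes a histogram-based hierarchical top-k query algorithm. First, distributed top-k queries are implemented hierarchically: the merging and sorting of results are spread across the nodes of the P2P network, making full use of the network's resources. Second, a histogram is built for each node from the results it returns; the histogram is used to estimate an upper bound on the node's possible scores and to select nodes, which improves query efficiency. Experiments show that top-k queries improve query effectiveness, while the histograms improve query efficiency.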

5.
In recent years there has been significant interest in peer-to-peer (P2P) environments in the data management community. However, almost all work so far has focused on exact query processing in current P2P data systems, and the autonomy of peers has not been considered sufficiently. In addition, the system cost is very high because the information publishing method for shared data is based on individual documents instead of document sets. In this paper, abstract indices (AbIx) are presented to implement content-based approximate queries in centralized, distributed and structured P2P data systems. They can be used to search as few peers as possible and still obtain as many results satisfying users' queries as possible, while preserving a high degree of peer autonomy. Abstract indices also have low system cost, can improve query processing speed, and support very frequent updates and the set-based information publishing method. To verify the effectiveness of abstract indices, a simulator with 10,000 peers and over 3 million documents is built, and several metrics are proposed. The experimental results show that abstract indices work well in various P2P data systems.

6.
Continuous processing of top-k queries over data streams is a promising technique for alleviating the information overload problem as it distinguishes relevant from irrelevant data stream objects with respect to a given scoring function over time. Thus it enables filtering of irrelevant data objects and delivery of top-k objects relevant to user interests in real-time. We propose a solution for distributed continuous top-k processing based on the publish/subscribe communication paradigm: top-k publish/subscribe over sliding windows (top-k/w publish/subscribe). It identifies k best-ranked objects with respect to a given scoring function over a sliding window of size w, and extends the publish/subscribe communication paradigm by continuous top-k processing algorithms coming from the field of data stream processing. In this paper, we introduce, analyze and evaluate the essential building blocks of distributed top-k/w publish/subscribe systems: first, we present a formal top-k/w publish/subscribe model and compare it to the prevailing Boolean publish/subscribe model. Next, we outline the top-k/w processing tasks performed by publish/subscribe nodes and investigate the properties of supported scoring functions. Furthermore, we explore potential routing strategies for distributed top-k/w publish/subscribe systems. Finally, we experimentally evaluate model properties and provide a comparative study investigating traffic requirements of potential routing strategies.
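The notion of "k best-ranked objects over a sliding window of size w" can be illustrated with a minimal single-node sketch. It assumes a count-based window (the last w published objects) and a caller-supplied scoring function; the class name `TopKWindow` and the method `publish` are hypothetical and do not come from the paper, and no distributed routing is modelled.

```python
from collections import deque
import heapq


class TopKWindow:
    """Keeps the last w objects and reports the k best-ranked ones."""

    def __init__(self, k, w, score):
        self.k = k
        self.score = score
        # deque(maxlen=w) silently evicts the oldest object on overflow.
        self.window = deque(maxlen=w)

    def publish(self, obj):
        """Add a newly published object and return the current top-k list."""
        self.window.append(obj)
        return heapq.nlargest(self.k, self.window, key=self.score)


# Hypothetical stream of numeric objects scored by their own value.
tracker = TopKWindow(k=2, w=3, score=lambda x: x)
results = [tracker.publish(v) for v in [5, 1, 9, 3, 2]]
```

Recomputing the top-k list on every publication is the simplest choice; an incremental structure would be needed for large windows.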

7.
Preference query processing is important for a wide range of applications involving distributed databases, such as network monitoring, web-based systems, and market analysis. In such applications, data objects are generated frequently and massively, which presents an important and challenging problem of continuous query processing over distributed data stream environments. A top-k dominating query, which has been receiving much research attention recently, returns the k data objects that dominate the highest number of data objects in a given dataset; because of its dominance-based ranking function, superior data objects can be obtained easily. An emerging requirement in distributed stream environments is an efficient technique for continuously monitoring top-k dominating data objects. Despite this requirement, no study has addressed the problem. In this paper, therefore, we address the problem of continuous top-k dominating query processing over distributed data stream environments. We present two algorithms that monitor the exact top-k dominating data and efficiently eliminate data objects that cannot qualify for the result, which reduces both communication and computation costs. In addition to these algorithms, we present an approximate algorithm that further reduces both communication and computation costs. Extensive experiments on both synthetic and real data have demonstrated the efficiency and scalability of our algorithms.
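The definition of a top-k dominating query can be made concrete with a brute-force, centralized sketch. It assumes that smaller values are better in every dimension and that ties are broken by input order; the function names are hypothetical, and this is not one of the monitoring algorithms the abstract refers to.

```python
def dominates(p, q):
    """True if p is no worse than q everywhere and strictly better somewhere.

    Assumption: smaller values are better in every dimension.
    """
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def top_k_dominating(objects, k):
    """Return the k objects that dominate the most other objects."""
    scored = [(sum(dominates(p, q) for q in objects), p) for p in objects]
    # Sort by dominance score, highest first; ties keep input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in scored[:k]]


# Hypothetical two-dimensional dataset.
data = [(1, 1), (2, 3), (3, 2), (4, 4), (1, 5)]
best = top_k_dominating(data, 2)
```

Here `(1, 1)` dominates all four other objects, while `(2, 3)` and `(3, 2)` each dominate only `(4, 4)`.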

8.
Unstructured Peer-to-Peer (P2P) networks have become a very popular architecture for content distribution in large-scale and dynamic environments. Searching for content in unstructured P2P networks is a challenging task because the distribution of objects has no association with the organization of peers. Methods proposed in recent years perform better than flooding or random walk algorithms to some extent, but they either depend too heavily on the object replication rate or suffer a sharp decline in performance when the objects stored in peers change rapidly. In this paper, we propose a novel query routing mechanism for improving query performance in unstructured P2P networks. We design a data structure called the traceable gain matrix (TGM) that records every query's gain at each peer along the query hit path and allows query routing decisions to be optimized effectively. Experimental results show that our query routing mechanism achieves a relatively high query hit rate with low bandwidth consumption in different types of network topologies under static and dynamic network conditions.

9.
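An Efficient Window Query Algorithm in P2P Environments   Total citations: 1, self-citations: 0, citations by others: 1
With the development of multimedia and P2P networks, attribute-based window queries over high-dimensional data have become an important research topic. This paper proposes an algorithm that efficiently answers window queries over high-dimensional data in super-peer P2P networks. On each individual network node, data are mapped to a one-dimensional space by a dimensionality reduction algorithm; on the super-peers, a table of data statistics and a network query tree are constructed. For each query, the algorithm visits the network according to the rules of the query tree and uses the statistics to prune node queries in the network, avoiding network flooding. The experiments use different datasets to evaluate the query efficiency of the algorithm, and the results show that the algorithm is highly efficient.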

10.
This work introduces decentralized query processing techniques based on MIDAS, a novel distributed multidimensional index. In particular, MIDAS implements a distributed k-d tree, where leaves correspond to peers and internal nodes dictate message routing. MIDAS requires that peers maintain little network information, and features mechanisms that support fault tolerance and load balancing. The proposed algorithms process point and range queries over the multidimensional indexed space in only O(log n) hops in expectation, where n is the network size. For nearest neighbor queries, two processing alternatives are discussed. The first, termed eager processing, has low latency (expected value of O(log n) hops) but may involve a large number of peers. The second, termed iterative processing, has higher latency (expected value of O(log^2 n) hops) but involves far fewer peers. A detailed experimental evaluation demonstrates that our query processing techniques outperform existing methods for settings involving real spatial data as well as in the case of high-dimensional synthetic data.

11.
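Xu Linhao, Qian Weining, Zhou Aoying. Journal of Software (软件学报), 2007, 18(6): 1443-1455
An important problem in data management for peer-to-peer computing is how to effectively support similarity search over multi-dimensional data spaces. Existing unstructured peer-to-peer data sharing systems support only a simple query processing method, namely match query processing. Combining approximation techniques with routing indices, this paper designs a simple and effective index structure, EVARI (extended approximate vector routing index). With EVARI, each node can not only process range queries over its locally shared dataset but also forward queries to the neighbor nodes most likely to yield query results. To build EVARI, each node summarizes its locally shared content using a space partitioning technique and exchanges summary information with its neighbors. Moreover, each node can reconfigure its neighbors so that related nodes are located close to one another, which optimizes the allocation of system resources and improves system performance. Simulation experiments demonstrate the good performance of the method.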

12.
Peer-to-Peer (P2P) computing has recently attracted a great deal of research attention. In a P2P system, a large number of nodes can potentially be pooled together to share their resources, information, and services. However, existing unstructured P2P systems lack support for content-based search over data objects, which are generally represented by high-dimensional feature vectors. In this paper, we propose an efficient and effective indexing mechanism, named Linking Identical Neighborly Partitions (LINP), to facilitate high-dimensional similarity queries in unstructured P2P systems. LINP combines a space partitioning technique with a routing index technique. With the aid of LINP, each peer can not only process similarity queries efficiently over its local data, but also route the query to the promising peers which may contain the desired data. In the proposed scheme, each peer summarizes its local data using the space partitioning technique, and exchanges the summarized index with its neighboring peers to construct routing indices. Furthermore, to improve system performance under peer updates, we propose an extension of LINP, named LINP+, in which each peer can reconfigure its neighboring peers to keep relevant peers nearby. The performance of our proposed scheme is evaluated over both synthetic and real-life high-dimensional datasets, and experimental results show the superiority of our proposed scheme.

13.
Distributed skyline computation is important for a wide range of domains, from distributed and web-based systems to ISP-network monitoring and distributed databases. The problem is particularly challenging in dynamic distributed settings, where the goal is to efficiently monitor a continuous skyline query over a collection of distributed streams. All existing work relies on the assumption of a single point of reference for object attributes/dimensions: objects may be vertically or horizontally partitioned, but the accurate value of each dimension for each object is always maintained by a single site. This assumption is unrealistic for several distributed applications, where object information is fragmented over a set of distributed streams (each monitored by a different site) and needs to be aggregated (e.g., averaged) across several sites. Furthermore, it is frequently useful to define skyline dimensions through complex functions over the aggregated objects, which raises further challenges for dealing with distribution and object fragmentation. We present the first known distributed algorithms for continuous monitoring of skylines over complex functions of fragmented multi-dimensional objects. Our algorithms rely on decomposition of the skyline monitoring problem to a select set of distributed threshold-crossing queries, which can be monitored locally at each site. We propose several optimizations, including: (a) a technique for adaptively determining the most efficient monitoring strategy for each object, (b) an approximate monitoring technique, and (c) a strategy that reduces communication overhead by grouping together threshold-crossing queries. Furthermore, we discuss how our proposed algorithms can be used to address other continuous query types. 
A thorough experimental study with synthetic and real-life data sets verifies the effectiveness of our schemes and demonstrates order-of-magnitude improvements in communication costs compared to the only alternative centralized solution.

14.
The increasing use of mobile communications has raised many issues of decision support and resource allocation. A crucial problem is how to answer Reverse Nearest Neighbour (RNN) queries. An RNN query returns all objects that consider the query object as their nearest neighbour. Existing methods mostly rely on a centralised base station. However, mobile P2P systems offer many benefits, including self-organisation, fault-tolerance and load-balancing. In this study, we propose and evaluate three distinct P2P algorithms for bichromatic RNN queries, in which mobile query peers and static objects of interest belong to two different categories. The algorithms are based on a time-out mechanism and a boundary polygon around the mobile query peers. The Brute-Force Search Algorithm provides a naive approach that exploits information shared among peers, whereas the two Boundary Search Algorithms filter the peers involved in query processing. The algorithms are evaluated in the MiXiM simulation framework with both real and synthetic datasets. The results show the practical feasibility of the P2P approach for solving bichromatic RNN queries in mobile networks.
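The definition of a bichromatic RNN query can be shown with a centralized brute-force sketch. It assumes Euclidean distance in the plane and that ties go to the peer listed first; the function name and sample coordinates are hypothetical, and the sketch models neither the time-out mechanism nor the boundary polygon of the paper's algorithms.

```python
import math


def bichromatic_rnn(query, query_peers, objects):
    """Return the objects whose nearest query peer is `query`.

    Brute force: every object is compared against every query peer.
    """
    result = []
    for obj in objects:
        nearest = min(query_peers, key=lambda peer: math.dist(peer, obj))
        if nearest == query:
            result.append(obj)
    return result


# Hypothetical positions: two mobile query peers, three static objects.
peers = [(0.0, 0.0), (10.0, 0.0)]
shops = [(1.0, 1.0), (4.0, 0.0), (9.0, 2.0)]
near_first = bichromatic_rnn(peers[0], peers, shops)
```

The first two objects are closer to the peer at the origin, so they form its reverse nearest neighbour set; the third belongs to the other peer.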

15.
Enterprise Communication Systems are designed in such a way to maximise the efficiency of communication and collaboration within the enterprise. With users becoming mobile, the Internet of Things (IoT) can play a crucial role in this process, but is far from being seamlessly integrated into modern online communications. In this paper, we present a semantic infrastructure for gathering, integrating and reasoning upon heterogeneous, distributed and continuously changing data streams by means of semantic technologies and rule-based inference. Our solution exploits semantics to go beyond today’s ad-hoc integration and processing of heterogeneous data sources for static and streaming data. It provides flexible and efficient processing techniques that can transform low-level data into high-level abstractions and actionable knowledge, bridging the gap between IoT and online Enterprise Communication Systems. We document the technologies used for acquisition and semantic enrichment of sensor data, continuous semantic query processing for integration and filtering, as well as stream reasoning for decision support. 
Our main contributions are the following: (i) we define and deploy a semantic processing pipeline for IoT-enabled Communication Systems, which builds upon existing systems for semantic data acquisition, continuous query processing and stream reasoning, detailing the implementation of each component of our framework; (ii) we present a rich semantic information model for representing and linking IoT data, social data and personal data in the Enterprise Communication scenario, by reusing and extending existing standard semantic models; (iii) we define and develop an expressive stream reasoning component as part of our framework, based on continuous query processing and non-monotonic reasoning for semantic streams; (iv) we conduct experiments to comparatively evaluate the performance of our data acquisition and semantic annotation layer based on OpenIoT, and the performance of our expressive reasoning layer in the scenario of Enterprise Communication.

16.
One of the key challenges in a peer-to-peer (P2P) network is to efficiently locate relevant data sources across a large number of participating peers. With the increasing popularity of the extensible markup language (XML) as a standard for information interchange on the Internet, XML is commonly used as an underlying data model for P2P applications to deal with the heterogeneity of data and enhance the expressiveness of queries. In this paper, we address the problem of efficiently locating relevant XML documents in a P2P network, where a user poses queries in a language such as XPath. We have developed a new system called psiX that runs on top of an existing distributed hashing framework. Under the psiX system, each XML document is mapped into an algebraic signature that captures the structural summary of the document. An XML query pattern is also mapped into a signature. The query's signature is used to locate relevant document signatures. Our signature scheme supports holistic processing of query patterns without breaking them into multiple path queries and processing them individually. The participating peers in the network collectively maintain a collection of distributed hierarchical indexes for the document signatures. Value indexes are built to handle numeric and textual values in XML documents. These indexes are used to process queries with value predicates. Our experimental study on PlanetLab demonstrates that psiX provides an efficient location service in a P2P network for a wide variety of XML documents.

17.
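Research on Clustering Models and Algorithms for Autonomous Heterogeneous Data Sources   Total citations: 1, self-citations: 0, citations by others: 1
The main problem in sharing information among autonomous heterogeneous data sources is how to provide unified access to the information of autonomous data nodes in a P2P environment. Organizing data source nodes in a hierarchical structure can improve query efficiency and reduce computational overhead, but it requires nodes to form local clusters according to their mutual similarity. This paper gives a formal description of information publishing by data source nodes, proposes a multiple clustering model for autonomous heterogeneous data sources based on schema element matching together with the process of building the cluster organization, uses the TA algorithm to solve the top-K cluster node search problem, and on this basis proposes the TAL algorithm. Experimental results show that the TA and TAL algorithms solve the node cluster ranking problem efficiently; in particular, TAL outperforms TA in computational performance when the range of cluster nodes is large.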

18.
Efficient monitoring of skyline queries over distributed data streams   Total citations: 1, self-citations: 0, citations by others: 1
Data management and data mining over distributed data streams have received considerable attention within the database community recently. This paper is the first work to address skyline queries over distributed data streams, where streams derive from multiple horizontally split data sources. A skyline query returns a set of interesting objects which are not dominated by any other objects within the base dataset. Previous work has concentrated on skyline computations over static data or centralized data streams. We present an efficient and effective algorithm called BOCS to handle this issue in the more challenging environment of distributed streams. BOCS consists of an efficient centralized algorithm, GridSky, and an associated communication protocol. Based on the strategy of progressive refinement in BOCS, the skyline is computed incrementally in two phases. In the first phase, local skylines on remote sites are maintained by GridSky. At each time, only skyline increments on remote sites are sent to the coordinator. In the second phase, a global skyline is obtained by integrating remote increments with the latest global skyline. A theoretical analysis shows that BOCS is communication-optimal among all algorithms which use a share-nothing strategy. Extensive experiments demonstrate that our proposals are efficient, scalable, and stable.
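The two-phase idea (local skylines at the sites, then a merge at the coordinator) can be sketched with a brute-force skyline function. This is only an illustration of the skyline definition and of merging local results: it assumes smaller values are better in every dimension, it ships whole local skylines instead of increments, and it is neither GridSky nor the BOCS protocol. All names and data are hypothetical.

```python
def dominates(p, q):
    """Assumption: smaller is better in every dimension."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def skyline(points):
    """Return the points not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical data held by two remote sites (horizontal split).
site_a = [(1, 9), (3, 3), (5, 5)]
site_b = [(2, 8), (4, 2), (6, 6)]

# Phase 1: each remote site keeps its own local skyline.
local_a, local_b = skyline(site_a), skyline(site_b)
# Phase 2: the coordinator merges the local skylines.
global_skyline = skyline(local_a + local_b)
```

Merging local skylines is sufficient because any point dominated within its own site is also dominated in the union, so it can never appear in the global skyline.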

19.
Recently, a number of query processors have been proposed for the evaluation of relational queries in structured P2P systems. However, as these approaches do not consider peer or link failures, they cannot be deployed in real-world applications without extensions. We show that typical failures in structured P2P systems can have an unpredictable impact on the correctness of the result. In particular, stateful operators that store intermediate results on peers, e.g., the distributed hash join, must protect such results against failures. Although many replication schemes for P2P systems exist, they cannot replicate operator states while the query is being processed. In this paper we propose an in-query replication scheme which replicates the state of an operator among the neighbors of the processing peer. Our analytical evaluation shows that the network overhead of the in-query replication is in O(1) with respect to network size, i.e., our scheme is scalable. We have carried out an extensive experimental evaluation using simulations as well as a PlanetLab deployment. It confirms the effectiveness and the efficiency of the in-query replication scheme and shows the effectiveness of the routing extension in networks of varying reliability.

20.
Sharing structured data in a P2P network is a challenging problem, especially in the absence of a mediated schema. The standard practice of answering a query that is successively rewritten along the propagation path often results in significant loss of information. Conversely, the use of mediated schemas requires human interaction and global agreement, both during creation and maintenance. In this paper we present GrouPeer, an adaptive, automated approach to both issues in the context of unstructured P2P database overlays. By allowing peers to individually choose which rewritten version of a query to answer and to evaluate the received answers, information-rich sources that would otherwise remain hidden are discovered. Gradually, the overlay is restructured as semantically similar peers are clustered together. Experimental results show that our technique produces very accurate answers and builds clusters that are very close to the optimal ones by contacting a very small number of nodes in the overlay.
