Similar literature
 Found 20 similar documents; search took 218 ms
1.
With the increasing popularity of the peer-to-peer (P2P) computing paradigm, many general range query schemes for distributed hash table (DHT)-based P2P systems have been proposed in recent years. Although those schemes can provide range query capability without modifying the underlying DHTs, their query delay depends on both the scale of the system and the size of the query space or the specific query, and thus they cannot guarantee that query results are returned within a bounded delay. In this paper, we propose Armada, an efficient range query processing scheme to support delay-bounded single-attribute and multiple-attribute range queries. It is the first delay-bounded general range query scheme on constant-degree DHTs, and can return the results for any range query within 2logN hops in a P2P system with N peers. Results of analysis and simulations show that the average delay in Armada is less than logN, and the average message cost of single-attribute range queries is about logN + 2n − 2 (n is the number of peers that intersect with the query). These results are very close to the lower bounds on delay and message cost of range queries over constant-degree DHTs.
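
As a rough illustration of the delay figures quoted above, the sketch below evaluates the 2logN worst-case bound and the logN ceiling on average delay for a given network size. The base-2 logarithm and the function name `armada_delay_bounds` are assumptions made here; the abstract does not state the base of the logarithm.

```python
import math


def armada_delay_bounds(num_peers: int) -> tuple[float, float]:
    """Return (worst-case hop bound, ceiling on average hops) for N peers.

    Assumption: logN is taken as log base 2, which the abstract does not specify.
    """
    if num_peers < 1:
        raise ValueError("num_peers must be at least 1")
    log_n = math.log2(num_peers)
    # The abstract states: any range query within 2logN hops, average below logN.
    return 2 * log_n, log_n
```

Under this assumption, a network of 1,024 peers gives a worst-case bound of 20 hops and an average delay below 10 hops.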

2.
Distributed hash tables (DHTs) are very efficient for querying based on key lookups. However, building huge term indexes, as required for IR-style keyword search, poses a scalability challenge for plain DHTs. Due to the large sizes of document term vocabularies, peers joining the network cause huge amounts of key inserts and, consequently, a large number of index maintenance messages. Thus, the key to exploiting DHTs for distributed information retrieval is to reduce index maintenance costs. Various approaches in this direction have been pursued, including the use of hybrid infrastructures, or changing the granularity of the inverted index to peer level. We show that indexing costs can be significantly reduced further by letting peers form groups in a self-organized fashion. Instead of each individual peer submitting index information separately, all peers of a group cooperate to publish the index updates to the DHT in batches. Our evaluation shows that this approach reduces index maintenance cost by an order of magnitude, while still keeping a complete and correct term index for query processing.
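
The batching idea can be sketched as follows. This is a minimal illustration, not the paper's protocol: it assumes a simple cost model in which each DHT insert is one message, and the names `individual_publish_cost` and `batched_publish` are hypothetical.

```python
from collections import defaultdict


def individual_publish_cost(peer_terms: dict[str, set[str]]) -> int:
    """Messages needed if every peer inserts each of its terms separately.

    Assumed cost model: one DHT message per (peer, term) pair.
    """
    return sum(len(terms) for terms in peer_terms.values())


def batched_publish(peer_terms: dict[str, set[str]]) -> dict[str, list[str]]:
    """Merge the postings of all peers in a group into one batch per term.

    The result maps each term to the sorted list of peers holding it, so the
    group sends one message per distinct term instead of one per (peer, term).
    """
    batch: dict[str, set[str]] = defaultdict(set)
    for peer, terms in peer_terms.items():
        for term in terms:
            batch[term].add(peer)
    return {term: sorted(peers) for term, peers in batch.items()}
```

For a group in which three peers share five (peer, term) pairs over three distinct terms, the batched form needs three messages rather than five; the saving grows with the overlap between the vocabularies of the group members.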

3.
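Xu Linhao, Qian Weining, Zhou Aoying. Journal of Software (软件学报), 2007, 18(6): 1443-1455
An important problem in peer-to-peer data management is how to efficiently support similarity search over multidimensional data spaces. Existing unstructured peer-to-peer data sharing systems support only simple query processing, namely match queries. By combining approximation techniques with routing indices, this paper designs a simple and effective index structure, EVARI (extended approximate-vector routing index). With EVARI, each peer can not only process range queries on its locally shared data set but also forward a query to the neighbors most likely to hold results. To build EVARI, each peer summarizes its locally shared content using a space-partitioning technique and exchanges summary information with its neighbors. In addition, each peer can reconfigure its neighbors so that related peers are located close to each other, which optimizes the allocation of system resources and improves system performance. Simulation experiments demonstrate the good performance of the method.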

4.
Distributed hash tables (DHTs) excel at exact-match lookups, but they do not directly support complex queries such as semantic search that is based on content. In this paper, we propose a novel approach to efficient semantic search on DHT overlays. The basic idea is to place indexes of semantically close files into the same peer nodes with high probability by exploiting information retrieval algorithms and locality sensitive hashing. A query for retrieving semantically close files is answered with high recall by consulting only a small number (e.g., 10–20) of nodes that store the indexes of the files semantically close to the query. Our approach adds only index information to peer nodes, imposing only a small storage overhead. Via detailed simulations, we show that our approach achieves high recall for queries at very low cost, i.e., the number of nodes visited for a query is about 10–20, independent of the overlay size.
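
One way to picture the placement step is the sketch below. The abstract does not say which locality sensitive hashing family is used, so random-hyperplane hashing is swapped in here as an assumption, and all names (`make_hyperplanes`, `signature`, `node_for`) are hypothetical. Vectors pointing in similar directions tend to receive the same bit signature and therefore the same node.

```python
import random


def make_hyperplanes(dim: int, bits: int, seed: int = 0) -> list[list[float]]:
    """Draw `bits` random hyperplanes in `dim` dimensions (seeded for repeatability)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits)]


def signature(vector: list[float], hyperplanes: list[list[float]]) -> int:
    """Pack one bit per hyperplane: 1 if the vector lies on its non-negative side."""
    sig = 0
    for plane in hyperplanes:
        dot = sum(v * p for v, p in zip(vector, plane))
        sig = (sig << 1) | (1 if dot >= 0 else 0)
    return sig


def node_for(vector: list[float], hyperplanes: list[list[float]], num_nodes: int) -> int:
    """Map a term vector to a node index; equal signatures always share a node."""
    return signature(vector, hyperplanes) % num_nodes
```

Because each bit depends only on the sign of a dot product, a document vector and any positively scaled copy of it (for example the same term distribution in a longer file) land on the same node. In practice several independent signatures would be used so that a query probes a small set of nodes, which matches the abstract's figure of a few nodes per query only loosely.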

5.
Accompanying the growth of the Internet, computers throughout the world can connect to each other and exchange information, increasing the convenience and efficiency of information-based work. The advent of data-sharing applications, such as Napster and Gnutella, has made peer-to-peer (P2P) systems popular for widespread exchange of resources and voluminous information between millions of users. In recent years, research issues associated with P2P systems have been discussed widely. To resolve the file-availability problem and improve the workload, a method called the Distributed Hash Table (DHT) has been proposed. However, DHT-based systems in structured architectures cannot support efficient queries, such as a similarity query, range query, and partial-match query, due to the characteristics of the hash function. This study presents a novel scheme that supports filename partial-matches in structured P2P systems. The proposed approach supports complex queries and guarantees result quality. Experimental results demonstrate the effectiveness of the proposed approach.

6.
Recently, peer-to-peer (P2P) search techniques have become popular on the Web as an alternative to centralized search due to their high scalability and low deployment cost. However, P2P search systems are known to suffer from the problem of peer dynamics, such as frequent node join/leave and document changes, which cause serious performance degradation. This paper presents the architecture of a P2P search system that supports full-text search in an overlay network with peer dynamics. This architecture, namely HAPS, consists of two layers of peers. The upper layer is a DHT (distributed hash table) network interconnected by some super peers (which we refer to as hubs). Each hub maintains distributed data structures called search directories, which can be used to guide the query and to control the search cost. The bottom layer consists of clusters of ordinary peers (called providers), which can receive queries and return relevant results. Extensive experimental results indicate that HAPS can perform searches effectively and efficiently. In addition, the performance comparison illustrates that HAPS outperforms a flat structured system and a hierarchical unstructured system in environments with peer dynamics.

7.
Peer knowledge management systems (PKMSs) offer a flexible architecture for decentralized knowledge sharing. In PKMSs, the knowledge sharing and evolution processes are based on peer ontologies. Finding an effective and efficient query rewriting algorithm for regular expression queries is vital for knowledge sharing between peers in PKMSs; our solution is based on graph-based query rewriting. Based on the graphs for both axioms and mappings, we design a novel algorithm, the regular expression rewriting algorithm, to rewrite regular expression queries along semantic paths. The simulation results show that the performance of our algorithm is better than Mork’s reformulation algorithms [P. Mork, Peer architectures for knowledge sharing, PhD thesis, University of Washington, 2005. <http://www.mitre.org/staffpages/pmork/>], and our algorithm is more effective than the naive rewriting algorithm.

8.
We propose to utilize the scalable peer-to-peer network to perform content-based image retrieval and mining, i.e., P2P-CBIRM. The decentralized unstructured P2P model with certain overheads, i.e., peer clustering and update procedures, is adopted as a compromise with the structured one while still preserving flexible routing control when peers join/leave or the network fails. The peer CBIRM engine is designed to utilize multi-instance queries with multi-feature types to effectively reduce network traffic while maintaining high retrieval accuracy. It helps to enhance the knowledge discovery and image data mining capability. The proposed P2P-CBIRM system provides a scalable retrieval and mining function in which the query scope and retrieval accuracy can be adaptively and progressively controlled. To improve the query efficiency (recall-rate/query-scope), it effectively utilizes both: 1) forwarding the query message (forward phase) to reduce the query scope and 2) transmitting retrieval results (backward phase) such that activated peers keep filtering high-similarity images on the link path toward the query peer. Experiments show that the query efficiency of the scalable retrieval approach is better than that of previous methods, i.e., the firework query model and breadth-first search. It provides a scalable knowledge discovery platform for efficient image data mining applications. We also propose to optimally configure the P2P-CBIRM system such that, under a certain number of online users, it yields the highest recall rate. Simulations demonstrate that, with the optimal configuration, recall rates increase to 2.5 to 3 times the original while the network traffic of each peer is reduced to 30% of the original, under the same number of online users.

9.
By combining an unstructured protocol with a DHT-based index, hybrid Peer-to-Peer (P2P) improves search efficiency in terms of query recall and response time. The key challenge in hybrid search is to estimate the number of peers that can answer a given query. Existing approaches assume that such a number can be directly obtained by computing item popularity. In this work, we show that such an assumption is not always valid, and previous designs cannot distinguish whether items related to a query are distributed in many peers or are in a few peers. To address this issue, we propose QRank, a difficulty-aware hybrid search, which ranks queries by weighting keywords based on term frequency. Using rank values, QRank selects proper search strategies for queries. We conduct comprehensive trace-driven simulations to evaluate this design. Results show that QRank significantly improves the search quality and reduces system traffic cost compared with existing approaches.

10.
This paper proposes a two-level P2P caching strategy for Web search queries. The design is suitable for a fully distributed service platform based on managed peer boxes (set-top boxes or DSL/cable modems) located at the edge of the network, where both the boxes and the access bandwidth to those boxes are controlled and managed by an ISP. Our solution significantly reduces user query traffic going outside of the ISP to get query results from the respective Web search engine. Web users are usually very reactive to worldwide events, which cause highly dynamic query traffic patterns leading to load imbalance across peers. Our solution contains a strategy to quickly ease imbalance on peers and spread communication flow among participating peers. Each peer maintains a local result cache used to keep the answers for queries originated in the peer itself and for queries for which the peer is responsible, contacting the Web search engine on demand. When query traffic is predominantly routed to a few responsible peers, our strategy replicates the role of “being responsible for” to neighboring peers so that they can absorb query traffic. This is a fairly slow and adaptive process that we call mid-term load balancing. To achieve a short-term fair distribution of queries, we introduce a location cache in each peer which keeps pointers to peers that have already requested the same queries in the recent past. This lets these peers share their query answers with newly requesting peers. This process is fast, as these popular queries are usually cached in the first DHT hop of a requesting peer, which quickly tends to redistribute load among more and more peers.

11.
Recent progress in peer-to-peer (P2P) search algorithms has presented viable structured and unstructured approaches for full-text search. We posit that these existing approaches are each best suited for different types of queries. We present PHIRST, the first system to facilitate effective full-text search within P2P databases. PHIRST works by effectively leveraging the relative strengths of these approaches. Similar to structured approaches, agents first publish terms within their stored documents. However, frequent terms are quickly identified and not exhaustively stored, resulting in a significant reduction in the system's storage requirements. During query lookup, agents use unstructured search to compensate for the lack of fully published terms. Additionally, they explicitly weigh the costs involved in structured and unstructured approaches, allowing for a significant reduction in query costs. Finally, we show how node failures can be handled effectively by storing multiple copies of selected data. We evaluated the effectiveness of our approach using both real-world and artificial queries. We found that in most situations our approach yields near-perfect recall. We discuss the limitations of our system, as well as possible compensatory strategies.

12.
We consider the problem of efficiently computing distributed geographical k-NN queries in an unstructured peer-to-peer (P2P) system, in which each peer is managed by an individual organization and can only communicate with its logical neighboring peers. Such queries are based on local filter query statistics and require as little communication cost as possible, which makes them more difficult than existing distributed k-NN queries. In particular, we aim to reduce the number of candidate peers and the communication cost. In this paper, we propose an efficient pruning technique to minimize the number of candidate peers to be processed to answer the k-NN queries. Our approach is especially suitable for continuous k-NN queries when peers are updated, including changes to the ranges of peers, peers dynamically leaving or joining, and updates to the data in a peer. In addition, simulation results show that the proposed approach outperforms the existing Minimum Bounding Rectangle (MBR)-based query approaches, especially for continuous queries.

13.
SSW: A Small-World-Based Overlay for Peer-to-Peer Search   Total citations: 2, self-citations: 0, citations by others: 2
Peer-to-peer (P2P) systems have become a popular platform for sharing and exchanging voluminous information among thousands or even millions of users. The massive amount of information shared in such systems mandates efficient semantic-based search instead of key-based search. The majority of existing proposals can only support simple key-based search rather than semantic-based search. This paper presents the design of an overlay network, namely, semantic small world (SSW), that facilitates efficient semantic-based search in P2P systems. SSW achieves the efficiency based on four ideas: 1) semantic clustering, where peers with similar semantics organize into peer clusters, 2) dimension reduction, where to address the high maintenance overhead associated with capturing high-dimensional data semantics in the overlay, peer clusters are adaptively mapped to a one-dimensional naming space, 3) small world network, where peer clusters form into a one-dimensional small world network, which is search efficient with low maintenance overhead, and 4) efficient search algorithms, where peers perform efficient semantic-based search, including approximate point query and range query in the proposed overlay. Extensive experiments using both synthetic data and real data demonstrate that SSW is superior to the state of the art on various aspects, including scalability, maintenance overhead, adaptivity to distribution of data and locality of interest, resilience to peer failures, load balancing, and efficiency in support of various types of queries on data objects with high dimensions.

14.
In the past few years, peer-to-peer (P2P) networks have become a promising paradigm for building a wide variety of distributed systems and applications. The most popular P2P application to date is file sharing, e.g., Gnutella, KaZaA, etc. These systems are usually referred to as unstructured, and search in unstructured P2P networks usually involves flooding or random walking. On the other hand, in structured P2P networks (DHTs), search is usually performed by looking up a distributed inverted index. The efficiency of the search mechanism is the key to the scalability of a P2P content sharing system. So far, neither unstructured nor structured P2P networks alone can solve the search problem in a satisfactory way. In this paper, we propose to combine the strengths of both unstructured and structured P2P networks to achieve more efficient search. Specifically, we propose to enhance search in unstructured P2P overlay networks by building a partial index of shared data using a structured P2P network. The index maintains two types of information: the top interests of peers and globally unpopular data, both characterized by data properties. The proposed search protocol, assisted search with partial indexing, makes use of the index to improve search in three ways. First, the index helps peers find other peers with similar interests, and the unstructured search overlay is formed to reflect peer interests. Second, the index provides search hints for data that are difficult to locate by exploiting peer interest locality, and these hints can be used for second-chance search. Third, the index helps to locate unpopular data items. Experiments based on a P2P file sharing trace show that assisted search with a lightweight partial indexing service significantly improves the success rate in locating data compared with Gnutella and a hit-rate-based protocol in unstructured P2P systems, while incurring low search latency and overheads.

15.
Recently, a number of query processors have been proposed for the evaluation of relational queries in structured P2P systems. However, as these approaches do not consider peer or link failures, they cannot be deployed without extensions for real-world applications. We show that typical failures in structured P2P systems can have an unpredictable impact on the correctness of the result. In particular, stateful operators that store intermediate results on peers, e.g., the distributed hash join, must protect such results against failures. Although many replication schemes for P2P systems exist, they cannot replicate operator states while the query is processed. In this paper we propose an in-query replication scheme which replicates the state of an operator among the neighbors of the processing peer. Our analytical evaluation shows that the network overhead of the in-query replication is O(1) with respect to network size, i.e., our scheme is scalable. We have carried out an extensive experimental evaluation using simulations as well as a PlanetLab deployment. It confirms the effectiveness and the efficiency of the in-query replication scheme and shows the effectiveness of the routing extension in networks of varying reliability.

16.
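Effective Top-k Queries in Pure Peer-to-Peer Environments   Total citations: 21, self-citations: 2, citations by others: 19
He Yingjie, Wang Shan, Du Xiaoyong. Journal of Software (软件学报), 2005, 16(4): 540-552
Most current peer-to-peer (P2P) systems support only search based on file identifiers; users cannot search by file content. Top-k queries are widely used in search engines and have been highly successful. However, because a P2P system is dynamic and decentralized, performing top-k queries in a pure P2P environment is challenging. This paper proposes a histogram-based hierarchical top-k query algorithm. First, distributed top-k queries are implemented hierarchically: the merging and ranking of results are spread across the nodes of the P2P network, making full use of the network's resources. Second, a histogram is built for each node from the results it returns; the histogram is used to estimate an upper bound on the scores the node can deliver, and nodes are selected accordingly, which improves query efficiency. Experiments show that top-k queries improve query effectiveness, while the histograms improve query efficiency.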
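
The hierarchical merging step can be sketched as below. This is a minimal illustration under stated assumptions, not the paper's algorithm: the overlay is modelled as a tree given by a hypothetical `children` mapping, results are `(score, doc_id)` pairs, and the histogram-based node selection is left out.

```python
import heapq

Result = tuple[float, str]  # (score, document id); higher scores rank first


def merge_top_k(k: int, *result_lists: list[Result]) -> list[Result]:
    """Merge any number of result lists and keep the k highest-scoring entries."""
    return heapq.nlargest(k, (item for lst in result_lists for item in lst))


def hierarchical_top_k(
    node: str,
    k: int,
    local_results: dict[str, list[Result]],
    children: dict[str, list[str]],
) -> list[Result]:
    """Return the top-k results of the subtree rooted at `node`.

    Each node merges its own results with the top-k lists of its children and
    passes at most k entries upward, so merging and ranking are spread over
    the nodes instead of being done at the query origin alone.
    """
    partial = [local_results.get(node, [])]
    for child in children.get(node, []):
        partial.append(hierarchical_top_k(child, k, local_results, children))
    return merge_top_k(k, *partial)
```

Because no node ever forwards more than k entries, the traffic on each link is bounded by k regardless of how many results exist below it. A histogram of past scores per neighbor could then be used to skip subtrees whose estimated best score cannot enter the current top k.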

17.
A desirable P2P file sharing system is expected to achieve the following design goals: scalability, routing efficiency and complex query support. In this paper, we propose a powerful P2P file sharing system, PSON, which satisfies all three of these properties. PSON is essentially a semantic overlay network of logical nodes. Each logical node represents a cluster of peers that are close to each other. A powerful peer is selected in each cluster to support query routing on the overlay network while the less powerful peers are responsible for the maintenance of shared contents. To facilitate query routing, super peers are organized in the form of a balanced binary search tree. By exploiting the concept of semantics, PSON can support complex queries in a scalable and efficient way. In this paper, we present the basic system design such as the semantic overlay construction, query routing and system dynamics. A load balancing scheme is proposed to further enhance the system performance. By simulation experiments, we show that PSON is scalable and efficient and is able to support complex queries.

18.
R.  G.  S.  M.  M. Performance Evaluation, 2005, 62(1-4): 1-16
This paper presents an analytical framework to study search strategies in large-scale decentralized unstructured peer-to-peer (P2P) networks. The peers comprising the P2P network and their application-level connections are modeled as generalized random graphs (GRGs) whose simple and efficient analysis is accomplished using the generating function of the graph’s degree distribution. The framework we defined allows the computation of several interesting performance indexes to be used to compare different search strategies: in particular, the average number of messages sent throughout the P2P network and the probability that a query is successful are used as examples. Furthermore, assuming that the cumulative distribution function (CDF) of the time required by a peer to positively reply to a query is known, we show how to derive the CDF of the time it takes for a randomly chosen peer to obtain at least one positive reply from other peers. The approach is validated through simulation showing that the accuracy of the proposed model improves as the size of the P2P network increases making it a suitable tool for the analysis of search strategies in large-scale systems.
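
For readers unfamiliar with the tool mentioned above, the standard generating-function definitions for a random graph with a given degree distribution are recalled here. The abstract does not give formulas, so the symbols $p_k$, $G_0$, $G_1$, $z_1$ and $z_2$ are notation assumed for this illustration rather than taken from the paper.

```latex
% p_k: probability that a randomly chosen peer has degree k (assumed notation)
G_0(x) = \sum_{k \ge 0} p_k \, x^k, \qquad G_0(1) = 1

% Mean degree, i.e. the expected number of first neighbours
z_1 = \langle k \rangle = G_0'(1) = \sum_{k \ge 0} k \, p_k

% Distribution of the remaining (excess) degree of a peer reached by following an edge
G_1(x) = \frac{G_0'(x)}{G_0'(1)}

% Expected number of second neighbours of a randomly chosen peer
z_2 = G_0'(1) \, G_1'(1) = G_0''(1)
```

Quantities such as the expected number of peers reached within a given number of hops, and hence the expected number of query messages, follow from repeated composition of $G_0$ and $G_1$, which is what makes this representation convenient for comparing search strategies.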

19.
20.
When a query is posed on a centralized database, if it refers to attributes that are not defined in the database, the user can expect to get either an error or an empty set. In contrast, when a query is posed on a peer in a P2P system and refers to attributes not found in the local database, the query should not be simply rejected if the relevant information is available at other peers. This paper proposes a query model for unstructured P2P systems to answer such queries. (a) We introduce a class of polymorphic queries, a revision of conjunctive queries that incorporates type variables to accommodate attributes not defined in the local database. (b) We define the semantics of polymorphic queries in terms of horizontal and vertical object expansions, to find attributes and tuples, respectively, missing from the local database. We show that both expansions can be conducted in a uniform framework. (c) We develop a top-K algorithm to approximately answer polymorphic queries. (d) We also provide a method to merge tuples collected from various peers, based on matching keys specified in polymorphic queries. Our experimental study verifies that polymorphic queries are able to find more sensible information than traditional queries supported by P2P systems, and that these queries can be evaluated efficiently.
