1.

Supporting nearest neighbors query on high-dimensional data in P2P systems

LI Mei, LEE Wang-Chien, SIVASUBRAMANIAM Anand, ZHAO Jizhong 《Frontiers of Computer Science》2008,2(3):234

Peer-to-peer systems have been widely used for sharing and exchanging data and resources among numerous computer nodes. Various data objects identifiable with high-dimensional feature vectors, such as text, images, and genome sequences, are starting to leverage P2P technology. Most existing work has focused on queries over data objects with one or a few attributes and is thus not applicable to high-dimensional data objects. In this study, we investigate K nearest neighbors (KNN) queries on high-dimensional data objects in P2P systems. An efficient query algorithm and solutions that address various technical challenges raised by high dimensionality, such as search space resolution and incremental search space refinement, are proposed. An extensive simulation using both synthetic and real data sets demonstrates that our proposal efficiently supports KNN queries on high-dimensional data in P2P systems.

2.

Efficient KNN search by linear projection of image clusters

Zaher Al Aghbari, Ayoub Al‐Hamadi 《International Journal of Intelligent Systems》2011,26(9):844-865

K‐nearest neighbors (KNN) search in a high‐dimensional vector space is an important paradigm for a variety of applications. Despite the continuous efforts in the past years, algorithms to find the exact KNN answer set at high dimensions are outperformed by a linear scan method. In this paper, we propose a technique to find the exact KNN image objects to a given query object. First, the proposed technique clusters the images using a self‐organizing map algorithm and then it projects the found clusters into points in a linear space based on the distances between each cluster and a selected reference point. These projected points are then organized in a simple, compact, and yet fast index structure called array‐index. Unlike most indexes that support KNN search, the array‐index requires a storage space that is linear in the number of projected points. The experiments show that the proposed technique is more efficient and robust to dimensionality as compared to other well‐known techniques because of its simplicity and compactness. © 2011 Wiley Periodicals, Inc.
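
The projection idea in this abstract can be illustrated with a small sketch. It is not the paper's array-index: it assumes individual points rather than SOM clusters, Euclidean distance, and a hypothetical function name `knn_with_reference`. Each point gets a one-dimensional key (its distance to a reference point), and the triangle inequality |d(q, r) - d(p, r)| <= d(q, p) gives a lower bound that lets the exact search stop early.

```python
import heapq
import math


def knn_with_reference(points, query, k, reference):
    """Exact k nearest neighbours of `query`, pruned with 1-D reference keys."""
    # One-dimensional key per point: its distance to the reference point.
    keyed = [(math.dist(p, reference), p) for p in points]
    q_key = math.dist(query, reference)
    # Visit points in increasing order of the lower bound |key - q_key|.
    keyed.sort(key=lambda kp: abs(kp[0] - q_key))

    best = []  # max-heap of (-distance, point) holding the current k best
    for key, p in keyed:
        lower_bound = abs(key - q_key)
        # Every remaining point has a lower bound at least this large,
        # so none can beat the current k-th nearest distance.
        if len(best) == k and lower_bound >= -best[0][0]:
            break
        d = math.dist(p, query)
        if len(best) < k:
            heapq.heappush(best, (-d, p))
        elif d < -best[0][0]:
            heapq.heapreplace(best, (-d, p))
    return sorted((-neg_d, p) for neg_d, p in best)


points = [(0, 0), (1, 1), (5, 5), (2, 2), (9, 0)]
nearest = knn_with_reference(points, query=(1.2, 1.0), k=2, reference=(0, 0))
```

In a real index the keys would be stored sorted once and searched outward from the query key, rather than re-sorted per query as this sketch does for brevity.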

3.

An Effective P2P Search Scheme to Exploit File Sharing Heterogeneity

Chen Wang, Li Xiao 《IEEE Transactions on Parallel and Distributed Systems》2007,18(2):145-157

Although the original intent of the peer-to-peer (P2P) concept is to treat each participant equally, heterogeneity widely exists in deployed P2P networks. Peers differ from each other in many aspects, such as bandwidth, CPU power, and storage capacity. Some approaches have been proposed to take advantage of the query forwarding heterogeneity such that the high bandwidth of powerful nodes can be fully utilized to maximize the system capacity. In this paper, we suggest using the query answering heterogeneity to directly improve the search efficiency of P2P networks. In our proposed differentiated search (DiffSearch) algorithm, the peers with high query answering capabilities will have higher priority to be queried. Because the query answering capabilities are extremely unbalanced among peers, a high query success rate can be achieved by querying only a small portion of a network. The search traffic is significantly reduced due to the shrunken search space. Our trace analysis and simulation show that the DiffSearch algorithm can save up to 60 percent of search traffic.

4.

Visible Reverse k-Nearest Neighbor Query Processing in Spatial Databases

Gao Yunjun, Zheng Baihua, Chen Gencai, Lee Wang-Chien, Lee Ken C. K., Li Qing 《IEEE Transactions on Knowledge and Data Engineering》2009,21(9):1314-1327

Reverse nearest neighbor (RNN) queries have a broad application base such as decision support, profile-based marketing, resource allocation, etc. Previous work on RNN search does not take obstacles into consideration. In the real world, however, there are many physical obstacles (e.g., buildings) and their presence may affect the visibility between objects. In this paper, we introduce a novel variant of RNN queries, namely, visible reverse nearest neighbor (VRNN) search, which considers the impact of obstacles on the visibility of objects. Given a data set P, an obstacle set O, and a query point q in a 2D space, a VRNN query retrieves the points in P that have q as their visible nearest neighbor. We propose an efficient algorithm for VRNN query processing, assuming that P and O are indexed by R-trees. Our techniques do not require any preprocessing and employ the half-plane property and visibility checks to prune the search space. In addition, we extend our solution to several variations of VRNN queries, including: 1) visible reverse k-nearest neighbor (VRkNN) search, which finds the points in P that have q as one of their k visible nearest neighbors; 2) delta-VRkNN search, which handles VRkNN retrieval with the maximum visible distance delta constraint; and 3) constrained VRkNN (CVRkNN) search, which tackles the VRkNN query with region constraint. Extensive experiments on both real and synthetic data sets have been conducted to demonstrate the efficiency and effectiveness of our proposed algorithms under various experimental settings.

5.

Supporting multi-dimensional queries in mobile P2P network

Lidan Shou, Xiaolong Zhang, Ping Wang, Gang Chen, Jinxiang Dong 《Information Sciences》2011,181(13):2841-2857

Nowadays, as mobile services become widely used, there is a strong demand for mobile support in P2P search techniques. In this paper, we introduce a new cost model for searching multi-dimensional data in mobile P2P environments and propose a novel multi-dimensional mobile P2P search framework called MIME. MIME models the physical node layout in a two-dimensional plane and keeps records of the locations of the nodes to construct a proximity-aware P2P overlay. MIME is able to employ two different split schemes for the construction of the overlay. We propose query processing techniques for this P2P overlay. In addition, we employ a novel expanding method for tuning the performance of KNN queries in MIME. We also discuss two adaptive features incorporated into MIME to support mobility: an update algorithm that makes dynamic updates to the overlay, and a cache mechanism that reduces the load of data migration during the updates. The experimental results show that the proposed techniques are effective, and that MIME achieves significant performance improvements in Point, Range, and KNN queries compared to the conventional system.

6.

SSW: A Small-World-Based Overlay for Peer-to-Peer Search (total citations: 2; self-citations: 0; citations by others: 2)

Mei Li, Wang-Chien Lee, Sivasubramaniam A., Jing Zhao 《IEEE Transactions on Parallel and Distributed Systems》2008,19(6):735-749

Peer-to-peer (P2P) systems have become a popular platform for sharing and exchanging voluminous information among thousands or even millions of users. The massive amount of information shared in such systems mandates efficient semantic-based search instead of key-based search. The majority of existing proposals can only support simple key-based search rather than semantic-based search. This paper presents the design of an overlay network, namely, semantic small world (SSW), that facilitates efficient semantic-based search in P2P systems. SSW achieves the efficiency based on four ideas: 1) semantic clustering, where peers with similar semantics organize into peer clusters, 2) dimension reduction, where to address the high maintenance overhead associated with capturing high-dimensional data semantics in the overlay, peer clusters are adaptively mapped to a one-dimensional naming space, 3) small world network, where peer clusters form into a one-dimensional small world network, which is search efficient with low maintenance overhead, and 4) efficient search algorithms, where peers perform efficient semantic-based search, including approximate point query and range query in the proposed overlay. Extensive experiments using both synthetic data and real data demonstrate that SSW is superior to the state of the art on various aspects, including scalability, maintenance overhead, adaptivity to distribution of data and locality of interest, resilience to peer failures, load balancing, and efficiency in support of various types of queries on data objects with high dimensions.

7.

HAPS: Supporting Effective and Efficient Full-Text P2P Search with Peer Dynamics

Zu-Jie Ren 《Journal of Computer Science and Technology》2010,25(3):482-498

Recently, peer-to-peer (P2P) search techniques have become popular on the Web as an alternative to centralized search due to their high scalability and low deployment cost. However, P2P search systems are known to suffer from the problem of peer dynamics, such as frequent node join/leave and document changes, which cause serious performance degradation. This paper presents the architecture of a P2P search system that supports full-text search in an overlay network with peer dynamics. This architecture, namely HAPS, consists of two layers of peers. The upper layer is a DHT (distributed hash table) network interconnected by some super peers (which we refer to as hubs). Each hub maintains distributed data structures called search directories, which could be used to guide the query and to control the search cost. The bottom layer consists of clusters of ordinary peers (called providers), which can receive queries and return relevant results. Extensive experimental results indicate that HAPS can perform searches effectively and efficiently. In addition, the performance comparison illustrates that HAPS outperforms a flat structured system and a hierarchical unstructured system in the environment with peer dynamics.

8.

Skyframe: a framework for skyline query processing in peer-to-peer systems

Shiyuan Wang, Quang Hieu Vu, Beng Chin Ooi, Anthony K. H. Tung, Lizhen Xu 《The VLDB Journal: The International Journal on Very Large Data Bases》2009,18(1):345-362

This paper looks at the processing of skyline queries on peer-to-peer (P2P) networks. We propose Skyframe, a framework for efficient skyline query processing in P2P systems, which addresses the challenges of quick response time, low network communication cost and query load balancing among peers. Skyframe consists of two querying methods: one is optimized for network communication while the other focuses on query response time. These methods are different in the way in which the query search space is defined. In particular, the first method uses a high dominating point that has a large dominating region to prune the search space to achieve a low cost in network communication. On the other hand, the second method relaxes the search space in order to allow parallel query processing to speed up query response. Skyframe achieves query load balancing by both query-load-conscious data space splitting/merging during the join/departure of nodes and dynamic load migration. We further show how to apply Skyframe to both the P2P systems supporting multi-dimensional indexing and the P2P systems supporting single-dimensional indexing. Finally, we have conducted extensive experiments on both real and synthetic data sets over two existing P2P systems: CAN (Ratnasamy in A scalable content-addressable network. In: Proceedings of SIGCOMM Conference, pp. 161–172, 2001) and BATON (Jagadish et al. in A balanced tree structure for peer-to-peer networks. In: Proceedings of VLDB Conference, pp. 661–672, 2005) to evaluate the effectiveness and scalability of Skyframe.
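
For readers unfamiliar with the term, a skyline query returns the points that no other point dominates. The sketch below is a centralized brute-force illustration of that definition, not Skyframe's distributed method; it assumes smaller values are better in every dimension, and the names `dominates` and `skyline` are chosen here for illustration.

```python
def dominates(a, b):
    """True if `a` is no worse than `b` everywhere and strictly better somewhere.

    Assumption: smaller values are preferred in every dimension.
    """
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def skyline(points):
    """Return the points that are not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical hotels as (price, distance to beach).
hotels = [(50, 8), (60, 3), (80, 1), (70, 5), (90, 2)]
best_hotels = skyline(hotels)
```

Here (70, 5) is dominated by (60, 3) and (90, 2) by (80, 1), so only the remaining three hotels are on the skyline. A dominating point with a large dominating region, as mentioned in the abstract, is one that would eliminate many candidates in this test.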

9.

KNN queries in P2P networks

Liu Dan, Xie Wenjun 《Journal of Computer Applications (计算机应用)》2010,30(5):1156-1158

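The K-nearest-neighbor (KNN) query is a type of similarity query. Most existing KNN query algorithms target centralized computing environments and therefore easily become performance bottlenecks. P2P, a newer distributed computing technology, can effectively overcome the performance bottlenecks of centralized environments. This paper proposes a KNN query method based on the iDistance index for a grouped P2P network structure. Its main idea is to prune the search space with a distributed cluster index and reduce network communication overhead, thereby executing KNN queries in a P2P environment. Finally, simulations test the effectiveness of the method and the influence of the number of groups and the data distribution on query cost.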
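
As background for the index named in this abstract, the following is a minimal sketch of the standard iDistance key mapping, not of the paper's distributed variant. Each point is assigned to its nearest reference point O_i and stored under the one-dimensional key i * c + dist(p, O_i), where c is a constant larger than any distance inside a partition. The function name, the reference points, and the value of c are assumptions made for illustration.

```python
import math


def idistance_key(point, references, c):
    """Map a point to its iDistance key: i * c + dist(point, references[i]).

    `i` is the index of the nearest reference point. `c` must exceed the
    largest in-partition distance so that key ranges do not overlap.
    """
    i, d = min(
        ((idx, math.dist(point, ref)) for idx, ref in enumerate(references)),
        key=lambda pair: pair[1],
    )
    return i * c + d


references = [(0.0, 0.0), (10.0, 10.0)]  # hypothetical cluster centres
key_a = idistance_key((1.0, 0.0), references, c=100.0)   # partition 0
key_b = idistance_key((10.0, 9.0), references, c=100.0)  # partition 1
```

Because the keys are plain numbers, they can be kept in any ordered one-dimensional structure, and a KNN search becomes a set of key-range scans around the query's distance to each reference point.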

10.

Answering why-not questions on KNN queries

Zhefan ZHONG, Xin LIN, Liang HE, Jing YANG 《Frontiers of Computer Science》2019,13(5):1062

After decades of study, the usability of database systems has received more attention in recent years. In particular, the ability to explain missing objects in a query result, known as answering “why-not” questions, has become a focus of concern. This paper studies the problem of answering why-not questions on KNN queries. In real life, many users use KNN queries to investigate their surroundings. Nevertheless, they often feel disappointed when the result does not include their expected objects. In this paper, we use the query refinement approach to resolve the problem. Given the original KNN query and a set of missing objects as input, our algorithm offers the user a refined KNN query whose result includes the missing objects. The experimental results demonstrate the efficiency of our proposed optimizations and algorithms.

11.

A proxy-node-based query mechanism for unstructured peer-to-peer networks

Zhao Qi, Liu Jiaoyao, Xu Jingdong 《Computer Engineering (计算机工程)》2007,33(22):134-136,139

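In flooding-based unstructured peer-to-peer networks, query messages are handled in the same way regardless of the popularity of the requested file, which generates a large number of unnecessary messages. To improve query efficiency, this paper proposes a query mechanism based on proxy nodes. The source node forwards a query message to multiple proxy nodes, which, together with the source node, launch multiple small floods. The source node controls the query process by adjusting the number of small floods. Compared with flooding queries in Gnutella, the new mechanism reduces bandwidth consumption by up to 56% while maintaining a similar success rate, and shortens response time by 15% while maintaining the same number of hits.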

12.

Reverse nearest neighbors in large graphs (total citations: 3; self-citations: 0; citations by others: 3)

Man Lung Yiu, Dimitris Papadias, Nikos Mamoulis, Yufei Tao 《IEEE Transactions on Knowledge and Data Engineering》2006,18(4):540-553

A reverse nearest neighbor (RNN) query returns the data objects that have a query point as their nearest neighbor (NN). Although such queries have been studied quite extensively in Euclidean spaces, there is no previous work in the context of large graphs. In this paper, we provide a fundamental lemma, which can be used to prune the search space while traversing the graph in search of RNNs. Based on it, we develop two RNN methods: an eager algorithm that attempts to prune network nodes as soon as they are visited and a lazy technique that prunes the search space when a data point is discovered. We study retrieval of an arbitrary number k of reverse nearest neighbors, investigate the benefits of materialization, cover several query types, and deal with cases where the queries and the data objects reside on nodes or edges of the graph. The proposed techniques are evaluated in various practical scenarios involving spatial maps, computer networks, and the DBLP coauthorship graph.
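
The RNN definition on a graph can be made concrete with a brute-force sketch. This is neither the eager nor the lazy algorithm from the paper: it simply runs Dijkstra's algorithm from every data node and checks the definition directly, with no pruning. It assumes an undirected weighted graph given as an adjacency dictionary, data objects located on nodes, and ties counted in favour of the query; all names are illustrative.

```python
import heapq

INF = float("inf")


def shortest_paths(graph, source):
    """Dijkstra's algorithm: shortest network distance from source to each node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, INF):
            continue  # stale heap entry
        for v, w in graph[u]:
            nd = d + w
            if nd < dist.get(v, INF):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def reverse_nearest_neighbors(graph, data_nodes, query):
    """Data nodes whose nearest neighbour (among data nodes and query) is query."""
    result = []
    for p in data_nodes:
        dist = shortest_paths(graph, p)
        d_query = dist.get(query, INF)
        others = [dist.get(o, INF) for o in data_nodes if o != p]
        if d_query < INF and all(d_query <= d for d in others):
            result.append(p)
    return result


# Path graph a - b - c - d - e; the query sits on node b.
graph = {
    "a": [("b", 1.0)],
    "b": [("a", 1.0), ("c", 1.0)],
    "c": [("b", 1.0), ("d", 1.0)],
    "d": [("c", 1.0), ("e", 0.5)],
    "e": [("d", 0.5)],
}
rnn = reverse_nearest_neighbors(graph, data_nodes=["a", "d", "e"], query="b")
```

Nodes d and e are each other's nearest neighbours, so only a has the query as its nearest neighbour. The cost of one full graph traversal per data node is what pruning rules such as the lemma mentioned in the abstract aim to avoid.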

13.

Reducing network traffic in unstructured P2P systems using Top-k queries (total citations: 1; self-citations: 0; citations by others: 1)

Reza Akbarinia, Esther Pacitti, Patrick Valduriez 《Distributed and Parallel Databases》2006,19(2-3):67-86

A major problem of unstructured P2P systems is their heavy network traffic. This is caused mainly by high numbers of query answers, many of which are irrelevant for users. One solution to this problem is to use Top-k queries whereby the user can specify a limited number (k) of the most relevant answers. In this paper, we present FD, a (Fully Distributed) framework for executing Top-k queries in unstructured P2P systems, with the objective of reducing network traffic. FD consists of a family of algorithms that are simple but effective. FD is completely distributed, does not depend on the existence of certain peers, and addresses the volatility of peers during query execution. We validated FD through implementation over a 64-node cluster and simulation using the BRITE topology generator and SimJava. Our performance evaluation shows that FD can achieve major performance gains in terms of communication and response time. Recommended by: Sunil Prabhakar. Work partially funded by the ARA Massive Data of the Agence Nationale de la Recherche.

14.

P2ST: A P2P search model based on a weighted search tree

Wu Ai, Liu Xinsong, Hao Yao, Yuan Lianhai 《Computer Science (计算机科学)》2007,34(8):64-68

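To address the low search efficiency of unstructured P2P systems, this paper proposes P2ST, a P2P search model based on a k-ary weighted search tree. The model builds a k-ary weighted tree that serves search: nodes are arranged from top to bottom by query hit rate, with nodes that have high and stable hit rates in the upper levels, so that the direction of message propagation can be determined during search. Search efficiency and load balance are further improved by caching upper-level nodes, indexing search results and initiating nodes, replicating overheated resources, and adding remote neighbors to leaf nodes. Analysis and simulation results show that the proposed model greatly reduces useless messages, achieves high search efficiency, and incurs low overhead for maintaining the search tree.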

15.

Multi-dimensional top-k dominating queries (total citations: 1; self-citations: 0; citations by others: 1)

Man Lung Yiu, Nikos Mamoulis 《The VLDB Journal: The International Journal on Very Large Data Bases》2009,18(3):695-718

The top-k dominating query returns k data objects which dominate the highest number of objects in a dataset. This query is an important tool for decision support since it provides data analysts an intuitive way for finding significant objects. In addition, it combines the advantages of top-k and skyline queries without sharing their disadvantages: (i) the output size can be controlled, (ii) no ranking functions need to be specified by users, and (iii) the result is independent of the scales at different dimensions. Despite their importance, top-k dominating queries have not received adequate attention from the research community. This paper is an extensive study on the evaluation of top-k dominating queries. First, we propose a set of algorithms that apply to indexed multi-dimensional data. Second, we investigate query evaluation on data that are not indexed. Finally, we study a relaxed variant of the query which considers dominance in dimensional subspaces. Experiments using synthetic and real datasets demonstrate that our algorithms significantly outperform a previous skyline-based approach. We also illustrate the applicability of this multi-dimensional analysis query by studying the meaningfulness of its results on real data.
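
The query definition in the first sentence can be written down directly. The sketch below is a quadratic brute-force baseline, not one of the paper's index-based algorithms; it assumes smaller values are better in every dimension, and the function names are chosen for illustration.

```python
def dominates(a, b):
    """True if `a` is no worse than `b` everywhere and strictly better somewhere.

    Assumption: smaller values are preferred in every dimension.
    """
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def top_k_dominating(points, k):
    """Return the k (score, point) pairs with the highest dominance scores.

    The score of a point is the number of other points it dominates.
    Ties keep their input order because the sort is stable.
    """
    scored = [(sum(dominates(p, q) for q in points), p) for p in points]
    scored.sort(key=lambda pair: -pair[0])
    return scored[:k]


points = [(1, 1), (2, 2), (3, 3), (1, 4), (4, 1)]
top2 = top_k_dominating(points, k=2)
```

In this data (1, 1) dominates the other four points and (2, 2) dominates only (3, 3), so they are the top two. Note that the score depends only on pairwise comparisons per dimension, which is why the result does not change when a dimension is rescaled.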

16.

Snapshot K-nearest-neighbor query processing for moving objects in road networks

Lu Bingliang, Liu Na 《Journal of Computer Applications (计算机应用)》2011,31(11):3078-3083

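This paper extends a framework that supports location-dependent queries over moving objects in road networks. It uses a disk-resident R-tree to store network connectivity and an in-memory grid structure to maintain location updates of moving objects, and proposes a snapshot K-nearest-neighbor query algorithm (SKNN) based on range queries (MNDR). For an arbitrary edge in the space, it analyzes the maximum and minimum numbers of grid cells that may be affected, describes the maximum extent of the search space used for snapshot range query processing, and estimates the subspace that contains the query result. Using this subspace as the range, it invokes MNDR to efficiently compute the KNN POIs of a query point in the road network, reducing I/O cost and shortening query time. Experimental comparison shows that, when scaled to hundreds of thousands of moving objects, SKNN achieves higher system throughput than S-GRID, a precomputation method for efficient query processing over spatial network data.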

17.

Efficient search based on content similarity over self-organizing P2P networks

Christos Doulkeridis, Akrivi Vlachou, Kjetil Nørvåg, Yannis Kotidis, Michalis Vazirgiannis 《Peer-to-Peer Networking and Applications》2010,3(1):67-79

The advent of the World Wide Web has made an enormous amount of information available to everyone and the widespread use of digital equipment enables end-users (peers) to produce their own digital content. This vast amount of information requires scalable data management systems. Peer-to-peer (P2P) systems have so far been well established in several application areas, with file-sharing being the most prominent. The next challenge that needs to be addressed is (more complex) data sharing, management and query processing, thus facilitating the delivery of a wide spectrum of novel data-centric applications to the end-user, while providing high Quality-of-Service. In this paper, we propose a self-organizing P2P system that is capable of identifying peers with similar content and intentionally assigning them to the same super-peer. During content retrieval, fewer super-peers need to be contacted and therefore efficient similarity search is supported, in terms of reduced network traffic and contacted peers. Our approach increases the responsiveness and reliability of a P2P system and we demonstrate the advantages of our approach using large-scale simulations.

18.

Direction-aware KNN queries for moving objects in a road network

Tianyang Dong, Lulu Yuan, Qiang Cheng, Bin Cao, Jing Fan 《World Wide Web》2019,22(4):1765-1797

Recently, more and more attention has been paid to k-nearest neighbor (KNN) query processing over moving objects in road networks, e.g., for taxi hailing and ride sharing. However, as far as we know, existing KNN queries take distance as the major criterion for nearest neighbor objects, without taking direction into consideration. The main issue with existing methods is that moving objects change their locations and directions frequently over time, so the information updates cannot be processed in time and they run the risk of retrieving incorrect KNN results. They may fail to meet users’ needs in certain scenarios, especially in the case of querying k-nearest neighbors for moving objects in a road network. In order to find the top k-nearest objects moving toward a query point, this paper presents a novel algorithm for direction-aware KNN (DAKNN) queries for moving objects in a road network. In this method, an R-tree and a simple grid are first used as the underlying index structure, where the R-tree is used for indexing the static road network and the simple grid is used for indexing the moving objects. Then, it introduces the notion of “azimuth” to represent the moving direction of objects in a road network, and presents a novel local network expansion method to quickly judge the direction of the moving objects. By considering whether a moving object is moving farther away from or getting closer to a query point, the object that is definitely not in the KNN result set is effectively excluded. Thus, we can reduce the communication cost while simplifying the computation of moving direction between moving objects and the query point. Comprehensive experiments are conducted and the results show that our algorithm can achieve real-time and efficient queries in retrieving objects moving toward the query point in a road network.
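
The role of an azimuth in filtering candidates can be sketched in the plain Euclidean plane. This is a simplification made for illustration: the paper works on road networks with a local network expansion method, whereas the code below assumes straight-line movement, planar coordinates, and a hypothetical 90-degree tolerance. The function names are likewise illustrative.

```python
import math


def azimuth(src, dst):
    """Bearing from src to dst in degrees, clockwise from north (+y axis)."""
    dx, dy = dst[0] - src[0], dst[1] - src[1]
    return math.degrees(math.atan2(dx, dy)) % 360.0


def is_approaching(prev_pos, cur_pos, query, tolerance=90.0):
    """True if the object's heading points toward the query within `tolerance` degrees."""
    heading = azimuth(prev_pos, cur_pos)  # direction of travel
    bearing = azimuth(cur_pos, query)     # direction to the query point
    diff = abs(heading - bearing) % 360.0
    diff = min(diff, 360.0 - diff)        # smallest angle between the two
    return diff < tolerance


# An object moving east: first toward a query to its east, then away from one to its west.
toward = is_approaching((0, 0), (1, 0), query=(5, 0))
away = is_approaching((0, 0), (1, 0), query=(-5, 0))
```

Objects for which such a test fails can be dropped before any distance computation, which is the kind of exclusion the abstract describes.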

19.

Finding superior skyline points for multidimensional recommendation applications

Jing Yang, Gabriel Pui Cheong Fung, Wei Lu, Xiaofang Zhou, Hong Chen, Xiaoyong Du 《World Wide Web》2012,15(1):33-60

In a typical Web recommendation system, objects are often described by many attributes. It also needs to serve many users with a diversified range of preferences. In other words, it must be capable of efficiently supporting high dimensional preference queries that allow the user to explore the data space effectively without imposing specific preference weightings for each dimension. The skyline query, which can produce a set of objects guaranteed to contain all top ranked objects for any linear attribute preference combination, has been proposed to support this type of recommendation application. However, it suffers from the problem known as ‘dimensionality curse’ as the size of skyline query result set can grow exponentially with the number of dimensions. Therefore, when the dimensionality is high, a large percentage of objects can become skyline points. This problem makes such a recommendation system less usable for users. In this paper, we propose a stronger type of skyline query, called core skyline query, that adopts a new quality measure called vertical dominance to return only an interesting subset of the traditional skyline points. An efficient query processing method is proposed to find core skyline points using a novel indexing structure called Linked Multiple B’-trees (LMB). Our approach can find such superior skyline points progressively without the need of computing the entire set of skyline points first.

20.

Research on semantics-supporting P2P search (total citations: 6; self-citations: 0; citations by others: 6)

Wang Zhixiao, Zhang Dalu, Liu Lei, Yao Chuanmao 《Computer Engineering and Applications (计算机工程与应用)》2007,43(3):8-11

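Traditional P2P systems search on a single feature term and do not support semantics, which limits them. Applying the vector space model (VSM) solves the problem of multi-term search in P2P systems; partitioning the identifier space clusters similar documents on neighboring nodes and speeds up search; and applying semantic ideas lets the P2P system understand search requests, which improves retrieval performance, especially recall. Simulation results show that multi-term search is achieved, search converges quickly, semantics are supported with improved retrieval performance, and nodes achieve good load balance.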
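
Multi-term matching under the vector space model is commonly scored with cosine similarity between term vectors. The sketch below illustrates that general idea only; it assumes raw term frequencies as weights (no TF-IDF and no semantic expansion) and is not taken from the paper. The function name and sample terms are illustrative.

```python
import math
from collections import Counter


def cosine_similarity(query_terms, doc_terms):
    """Cosine of the angle between two raw term-frequency vectors."""
    q, d = Counter(query_terms), Counter(doc_terms)
    dot = sum(q[t] * d[t] for t in q)  # Counter returns 0 for missing terms
    norm = (math.sqrt(sum(v * v for v in q.values()))
            * math.sqrt(sum(v * v for v in d.values())))
    return dot / norm if norm else 0.0


query = ["p2p", "search"]
related = cosine_similarity(query, ["p2p", "search", "semantic"])
unrelated = cosine_similarity(query, ["image", "index"])
```

A document sharing both query terms scores well above one sharing none, which is what lets a multi-term query be ranked rather than matched on a single keyword.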