Similar Literature
20 similar articles retrieved (search time: 31 ms)
1.
Continuous k-nearest-neighbor (CKNN) queries over moving objects: given a set of continuously moving objects and a k-nearest-neighbor query q, the task is to compute the k nearest neighbors of q in real time and to keep the result up to date throughout the query's lifetime. Many location-based services in areas such as transportation, social networking, and e-commerce rest on this fundamental problem. Most existing work on CKNN queries determines a query range containing the k nearest neighbors through repeated iterations, and each iteration must count the moving objects that currently fall within the candidate range; this iterative process accounts for a large share of the total query cost. To address this, a two-level index structure, the grid GMM index (GGI), is proposed, combining a grid index with a Gaussian-mixture model of the moving objects' spatial density, together with an incremental algorithm for continuous k-nearest-neighbor search (IS-CKNN). The lower level of GGI maintains the large set of moving objects in a grid index, while the upper level builds a Gaussian mixture model that approximates their distribution in two-dimensional space. For a given kNN query q, IS-CKNN uses the mixture model to directly determine a region that contains the k nearest neighbors of q, eliminating the repeated iterations required by existing algorithms. When the moving objects or the query q change position, an incremental query strategy reuses the previous result as far as possible to reduce the cost of the current query. Extensive experiments on a DiDi ride-hailing dataset from Chengdu and two synthetic datasets validate the performance of the algorithms.
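A minimal sketch of the density-guided idea described above: estimate, from the local density around q, a radius expected to contain k objects, then verify the candidates against a grid index. It is not the paper's implementation; the cell size, the fallback expansion factor, and the use of grid counts in place of a fitted Gaussian mixture are all illustrative assumptions.

```python
import math
from collections import defaultdict

CELL = 0.01  # grid cell width; an assumed tuning parameter

def build_grid(points):
    """points: {object_id: (x, y)} -> grid cell -> list of object ids."""
    grid = defaultdict(list)
    for oid, (x, y) in points.items():
        grid[(int(x // CELL), int(y // CELL))].append(oid)
    return grid

def knn_once(q, k, points, grid):
    """Pick a search radius from local density, then verify; assumes >= k objects exist."""
    qx, qy = int(q[0] // CELL), int(q[1] // CELL)
    # Local density read off the 3x3 cell neighbourhood around q.
    near = sum(len(grid[(qx + i, qy + j)]) for i in (-1, 0, 1) for j in (-1, 0, 1))
    density = max(near, 1) / (9 * CELL * CELL)
    r = math.sqrt(k / (math.pi * density))          # radius expected to hold k objects
    while True:
        span = int(math.ceil(r / CELL))
        cand = [oid for i in range(-span, span + 1) for j in range(-span, span + 1)
                for oid in grid[(qx + i, qy + j)]]
        inside = [(math.dist(q, points[o]), o) for o in cand
                  if math.dist(q, points[o]) <= r]
        if len(inside) >= k:
            return sorted(inside)[:k]
        r *= 1.5                                     # fallback: enlarge if the estimate was low
```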

2.
With the increasing popularity of the peer-to-peer (P2P) computing paradigm, many general range query schemes for distributed hash table (DHT)-based P2P systems have been proposed in recent years. Although those schemes can provide range query capability without modifying the underlying DHTs, they have the query delay depending on both the scale of the system and the size of the query space or the specific query, and thus cannot guarantee to return the query results in a bounded delay. In this paper, we propose Armada, an efficient range query processing scheme to support delay-bounded single-attribute and multiple-attribute range queries. It is the first delay-bounded general range query scheme on constant-degree DHTs, and can return the results for any range query within 2logN hops in a P2P system with N peers. Results of analysis and simulations show that the average delay in Armada is less than logN, and the average message cost of single-attribute range queries is about logN+2n 2 (n is the number of peers that intersect with the query). These results are very close to the lower bounds on delay and message cost of range queries over constant-degree DHTs.  相似文献   

3.
Structured P2P networks based on distributed hash tables (DHTs) offer good scalability, robustness, and self-organization, but natively support only exact-match lookups. This paper proposes a range query method for structured P2P networks based on a distributed range tree (DRT-RQ). The method distributes a multidimensional-index range tree over an existing structured DHT overlay and uses the data lookup interface provided by the DHT system to evaluate range queries over data objects efficiently. Experimental results show that range queries based on the distributed range tree (DRT-RQ) require shorter query latency than range queries based on the prefix hash tree (PHT-RQ).

4.
MAAN: A Multi-Attribute Addressable Network for Grid Information Services
Recent structured Peer-to-Peer (P2P) systems such as Distributed Hash Tables (DHTs) offer scalable key-based lookup for distributed resources. However, they cannot be simply applied to grid information services because grid resources need to be registered and searched using multiple attributes. This paper proposes a Multi-Attribute Addressable Network (MAAN) that extends Chord to support multi-attribute and range queries. MAAN addresses range queries by mapping attribute values to the Chord identifier space via uniform locality preserving hashing. It uses an iterative or single attribute dominated query routing algorithm to resolve multi-attribute based queries. Each node in MAAN only has O(log N) neighbors for N nodes. The number of routing hops to resolve a multi-attribute range query is O(log N + N × s_min), where s_min is the minimum range selectivity on all attributes. When s_min = ε, it is logarithmic to the number of nodes, which is scalable to a large number of nodes and attributes. We also measured the performance of our MAAN implementation and the experimental results are consistent with our theoretical analysis.
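A minimal sketch of the uniform locality-preserving hashing step described in the abstract: an attribute value is mapped linearly onto the m-bit Chord identifier space, so a value range maps to a contiguous interval of identifiers that can be walked along successor nodes. The attribute bounds and identifier width below are illustrative assumptions, not values from the paper.

```python
M = 16                                   # identifier space size 2^M (assumed)
LO, HI = 0.0, 100.0                      # known attribute bounds (assumed)

def lp_hash(value):
    """Locality-preserving: v1 <= v2 implies lp_hash(v1) <= lp_hash(v2)."""
    frac = (value - LO) / (HI - LO)
    return min(int(frac * (2 ** M)), 2 ** M - 1)

def range_to_ids(lo_val, hi_val):
    """A range query becomes a contiguous identifier interval on the Chord ring."""
    return lp_hash(lo_val), lp_hash(hi_val)

# e.g. an attribute range [20, 35] becomes an identifier interval: route to the
# lower identifier as in Chord, then forward along successors until the upper
# identifier is passed.
print(range_to_ids(20.0, 35.0))
```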

5.
Continuous query processing in data stream management systems (DSMS) has received considerable attention recently. Many applications share the same need for processing data streams in a continuous fashion. For most distributed streaming applications, the centralized processing of continuous queries over distributed data is simply not viable. This paper addresses the problem of computing approximate answers to continuous join queries over distributed data streams. We present a new method, called DHTJoin, which combines hash-based placement of tuples in a Distributed Hash Table (DHT) and dissemination of queries by exploiting the embedded trees in the underlying DHT, thereby incurring little overhead. DHTJoin also deals with join attribute value skew which may hurt load balancing and result completeness. We provide a performance evaluation of DHTJoin which shows that it can achieve significant performance gains in terms of network traffic.  相似文献   
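A minimal sketch of the hash-based tuple placement that underlies a DHT join: tuples from both inputs are keyed by their join-attribute value, so joinable tuples meet on the same peer, which joins them locally. The peer count and the use of Python's built-in hash() stand in for the real DHT key mapping; query dissemination over embedded trees and skew handling are not shown.

```python
from collections import defaultdict

N_PEERS = 8  # assumed overlay size

def place(peers, relation, tup, join_key):
    """DHT put(key) analogue: the join-attribute value decides the peer."""
    peer_id = hash(join_key) % N_PEERS
    peers[peer_id][relation].append((join_key, tup))

def local_join(peer):
    """Each peer joins only the tuples that were placed on it."""
    probe = defaultdict(list)
    for k, t in peer["R"]:
        probe[k].append(t)
    return [(r, s) for k, s in peer["S"] for r in probe.get(k, [])]

peers = [defaultdict(list) for _ in range(N_PEERS)]
place(peers, "R", ("order#1", "alice"), join_key="alice")
place(peers, "S", ("alice", "paris"), join_key="alice")
print([pair for p in peers for pair in local_join(p)])
```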

6.
In the cloud resource sharing service model, to support multi-attribute range queries over cloud resources, an improved E-SkipNet network is proposed. First, building on the traditional DHT network SkipNet, E-SkipNet incorporates data attributes into the node NameID so that a physical node joins a single attribute domain, which supports multi-attribute range queries. Second, extending the original E-SkipNet, each physical node is mapped to multiple logical nodes at the same time; multiple attribute domains are introduced, and resources are published to different logical nodes according to their attributes. Finally, resources are mapped and stored using a uniform locality-preserving hash function, which preserves the order of attribute values within each attribute domain and thus supports range queries. Simulation results show that the improved E-SkipNet improves routing efficiency by 18.09% and 20.47% compared with the original E-SkipNet and the Multi-Attribute Addressable Network (MAAN), respectively. The results indicate that the improved E-SkipNet supports more efficient multi-attribute range queries over cloud resources and achieves better load balancing in heterogeneous environments.

7.
Due to the widespread use of geo-positioning technologies and geo-social networks, the reverse top-k geo-social keyword query has attracted considerable attention from both industry and research communities. A reverse top-k geo-social keyword (RkGSK) query finds the users who are spatially near, textually similar, and socially relevant to a specified point of interest. RkGSK queries are useful in many real-life applications. For example, they can help the query issuer identify potential customers in marketing decisions. However, the query constraints could be too strict sometimes, making it hard to find any result for the RkGSK query. The query issuers may wonder how to modify their original queries to get a certain number of query results. In this paper, we study non-answer questions on reverse top-k geo-social keyword queries (NARGSK). Given an RkGSK query and the required number M of query results, NARGSK aims to find the refined RkGSK query having M users in its result set. To efficiently answer NARGSK, we propose two algorithms (ERQ and NRG) based on query relaxation. As this is the first work to address NARGSK to the best of our knowledge, ERQ is the baseline extended from the state-of-the-art method, while NRG further improves the efficiency of ERQ. Extensive experiments using real-life datasets demonstrate the efficiency of our proposed algorithms, and the performance of NRG is improved by a factor of 1–2 on average compared with ERQ.

8.
The in-network aggregation paradigm in sensor networks provides a versatile approach for evaluating aggregate queries. Traditional approaches need a separate aggregate to be computed and communicated for each query and hence do not scale well with the number of queries. Since approximate query results are sufficient for many applications, we use an alternate approach based on summary data-structures. We consider two kinds of aggregate queries: location range queries that compute the sum of values reported by sensors in a given location range, and value range queries that compute the number of sensors that report values in a given range. We construct summary data-structures called linear sketches, over the sensor data using in-network aggregation and use them to answer aggregate queries in an approximate manner at the base-station. There is a trade-off between accuracy of the query results and lifetime of the sensor network that can be exploited to achieve increased lifetimes for a small loss in accuracy. Most commonly occurring sets of range queries are highly correlated and display rich algebraic structure. Our approach takes full advantage of this by constructing linear sketches that depend on queries. Experimental results show that linear sketching achieves significant improvements in lifetime of sensor networks for only a small loss in accuracy of the queries. Further, our approach achieves more accurate query results than the other classical techniques using Discrete Fourier Transform and Discrete Wavelet Transform. This work was supported in part by NASA under Cooperative Agreement NCC5-315.
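A hedged illustration of a linear sketch for approximate range-sum queries: each reading contributes to a short random-projection vector, sketches add up along the aggregation tree (linearity), and the base station estimates a range sum from the sketch alone. The sizes and random seed below are illustrative; the paper's query-dependent sketch construction is not reproduced.

```python
import random

N_LOC = 64      # number of discretised sensor locations (assumed)
WIDTH = 16      # sketch width: the accuracy vs. lifetime knob

rng = random.Random(7)
# Shared +/-1 projection matrix known to all nodes and the base station.
R = [[rng.choice((-1, 1)) for _ in range(N_LOC)] for _ in range(WIDTH)]

def sketch(vec):
    """Linear map R*vec; sketches of children can simply be summed in-network."""
    return [sum(R[i][j] * vec[j] for j in range(N_LOC)) for i in range(WIDTH)]

def estimate_range_sum(sk, lo, hi):
    """Range sum = inner product with the range indicator, estimated from sketches."""
    ind = [1.0 if lo <= j <= hi else 0.0 for j in range(N_LOC)]
    sk_ind = sketch(ind)
    return sum(a * b for a, b in zip(sk, sk_ind)) / WIDTH

readings = [float(j % 5) for j in range(N_LOC)]
print(sum(readings[10:21]), estimate_range_sum(sketch(readings), 10, 20))
```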

9.
Large highly distributed data sets are poorly supported by current query technologies. Applications such as endsystem-based network management are characterized by data stored on large numbers of endsystems, with frequent local updates and relatively infrequent global one-shot queries. The challenges are scale (10^3 to 10^9 endsystems) and endsystem unavailability. In such large systems, a significant fraction of endsystems and their data will be unavailable at any given time. Existing methods to provide high data availability despite endsystem unavailability involve centralizing, redistributing or replicating the data. At large scale these methods are not scalable. We advocate a design that trades query delay for completeness, incrementally returning results as endsystems become available. We also introduce the idea of completeness prediction, which provides the user with explicit feedback about this delay/completeness trade-off. Completeness prediction is based on replication of compact data summaries and availability models. This metadata is orders of magnitude smaller than the data. Seaweed is a scalable query infrastructure supporting incremental results, online in-network aggregation and completeness prediction. It is built on a distributed hash table (DHT) but unlike previous DHT based approaches it does not redistribute data across the network. It exploits the DHT infrastructure for failure-resilient metadata replication, query dissemination, and result aggregation. We analytically compare Seaweed's scalability against other approaches and also evaluate the Seaweed prototype running on a large-scale network simulator driven by real-world traces.

10.
Recently research has deeply investigated the problem of querying semi-structured data and data which can be represented by means of graphs (e.g. object-oriented data, XML data, etc.). Typically queries on graph-like data, called path queries, are expressed by means of regular expressions denoting paths in the graph. The result of a path query is the set of nodes reachable by means of a path expressed by a specified regular expression. In this paper we investigate the problem of extracting a subgraph satisfying a given property from a given graph representing some information. We propose a new form of queries, called graph queries, whose answers are (marked) graphs having a particular structure, extracted from the source graph. We show that a simple form of graph grammars can be profitably used to define graph queries. The result of a graph query, using a grammar G over a database D, is a marked subgraph of D ‘matching’ a graph derived from G. We consider different types of graph grammars which can be used to query graph-like data and consider their expressiveness and complexity.  相似文献   
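A minimal sketch of path-query evaluation in the sense described above: the regular expression over edge labels is represented as a finite automaton (hand-built here rather than compiled), and a breadth-first search over the product of graph and automaton returns every node reachable along a matching path. The graph and the expression dept.course*.book are made-up examples.

```python
from collections import deque

graph = {                                   # node -> list of (edge_label, node)
    "root": [("dept", "cs"), ("dept", "ee")],
    "cs":   [("course", "db"), ("course", "ai")],
    "db":   [("book", "b1")],
    "ai":   [("book", "b2")],
    "ee":   [("book", "b3")],
}
# Automaton for the path expression  dept . course* . book
dfa = {(0, "dept"): 1, (1, "course"): 1, (1, "book"): 2}
ACCEPT = {2}

def path_query(start):
    """BFS over (node, automaton state); collect nodes reached in an accepting state."""
    seen, out = {(start, 0)}, set()
    queue = deque([(start, 0)])
    while queue:
        node, state = queue.popleft()
        if state in ACCEPT:
            out.add(node)
        for label, nxt in graph.get(node, []):
            nxt_state = dfa.get((state, label))
            if nxt_state is not None and (nxt, nxt_state) not in seen:
                seen.add((nxt, nxt_state))
                queue.append((nxt, nxt_state))
    return out

print(path_query("root"))   # {'b1', 'b2', 'b3'}
```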

11.
This work introduces decentralized query processing techniques based on MIDAS, a novel distributed multidimensional index. In particular, MIDAS implements a distributed k-d tree, where leaves correspond to peers, and internal nodes dictate message routing. MIDAS requires that peers maintain little network information, and features mechanisms that support fault tolerance and load balancing. The proposed algorithms process point and range queries over the multidimensional indexed space in only O(log n) hops in expectance, where n is the network size. For nearest neighbor queries, two processing alternatives are discussed. The first, termed eager processing, has low latency (expected value of O(log n) hops) but may involve a large number of peers. The second, termed iterative processing, has higher latency (expected value of O(log2 n) hops) but involves far fewer peers. A detailed experimental evaluation demonstrates that our query processing techniques outperform existing methods for settings involving real spatial data as well as in the case of high dimensional synthetic data.  相似文献   
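A minimal sketch of the distributed k-d tree idea: internal nodes split one dimension at a value, leaves stand for peers, and a lookup descends one split per hop, so point queries take O(depth) hops and range queries visit exactly the leaves whose regions intersect the range. The fixed toy partition below is an assumption; MIDAS's maintenance, fault-tolerance, and load-balancing machinery are not shown.

```python
class Split:
    def __init__(self, dim, value, low, high):
        self.dim, self.value, self.low, self.high = dim, value, low, high

class Peer:
    def __init__(self, name):
        self.name = name

root = Split(0, 0.5,                                  # split x at 0.5
             Split(1, 0.5, Peer("p0"), Peer("p1")),   # left half split on y
             Split(1, 0.5, Peer("p2"), Peer("p3")))   # right half split on y

def route_point(node, pt, hops=0):
    """Point query: one routing hop per split, O(depth) overall."""
    if isinstance(node, Peer):
        return node.name, hops
    child = node.low if pt[node.dim] < node.value else node.high
    return route_point(child, pt, hops + 1)

def route_range(node, lo, hi):
    """Range query: visit every leaf whose region intersects the box [lo, hi]."""
    if isinstance(node, Peer):
        return [node.name]
    peers = []
    if lo[node.dim] < node.value:
        peers += route_range(node.low, lo, hi)
    if hi[node.dim] >= node.value:
        peers += route_range(node.high, lo, hi)
    return peers

print(route_point(root, (0.7, 0.2)))              # ('p2', 2)
print(route_range(root, (0.4, 0.4), (0.6, 0.6)))  # all four peers
```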

12.
In recent years, many layered indexing techniques over distributed hash table (DHT)-based peer-to-peer (P2P) systems have been proposed to realize distributed range search. In this paper, we present a fault tolerant constant degree dynamic Distributed Spatial Data Structure called DSDS that supports orthogonal range search on a set of N d-dimensional points published on n nodes. We describe a total order binary relation algorithm to publish points among supernodes and determine supernode keys. A non-redundant rainbow skip graph is used to coordinate message passing among nodes. The worst case orthogonal range search cost in a d-dimensional DSDS with n nodes is O(log n + m + K/B) messages, where m is the number of nodes intersecting the query, K is the number of points reported in range, and B is the number of points that can fit in one message. A complete backup copy of data points stored in other nodes provides redundancy for our DSDS. This redundancy permits answering a range search query in the case of failure of a single node. For single node failure, the DSDS routing system can be recovered to a fully functional state at a cost of O(log n) messages. Backup sets in DSDS nodes are used to first process a query in the most efficient dimension, and then used to process a query containing the data in a failed node in d-dimensional space. The DSDS search algorithm can process queries in d-dimensional space and still tolerate failure of one node. Search cost in the worst case with a failed node increases to O(d log n + dm + K/B) messages for d dimensions.

13.
We present new results on Cartesian trees with applications in range minimum queries and bottleneck edge queries. We introduce a cache-oblivious Cartesian tree for solving the range minimum query problem, a Cartesian tree for the bottleneck edge query problem on trees and undirected graphs, and a proof that no Cartesian tree exists for the two-dimensional version of the range minimum query problem.
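A short reminder of the classic reduction the abstract builds on: the Cartesian tree of an array is built in O(n) with a stack, and a range-minimum query reduces to the lowest common ancestor of the two endpoints in that tree. The LCA below is computed naively for brevity; the paper's contribution concerns cache-oblivious layout and bottleneck edge queries, not this reduction itself.

```python
def cartesian_tree(a):
    """Parent array of the Cartesian tree (min at the root), built in O(n)."""
    parent = [-1] * len(a)
    stack = []                               # indices with increasing values
    for i, v in enumerate(a):
        last = -1
        while stack and a[stack[-1]] > v:    # pop larger values; the last one hangs below i
            last = stack.pop()
        if last != -1:
            parent[last] = i
        if stack:
            parent[i] = stack[-1]
        stack.append(i)
    return parent

def rmq(a, parent, i, j):
    """LCA of i and j in the Cartesian tree is the position of min(a[i..j])."""
    anc = set()
    x = i
    while x != -1:
        anc.add(x)
        x = parent[x]
    x = j
    while x not in anc:
        x = parent[x]
    return x

a = [5, 2, 7, 1, 9, 3]
parent = cartesian_tree(a)
print(rmq(a, parent, 2, 5), a[rmq(a, parent, 2, 5)])   # index 3, value 1
```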

14.
Spatiotemporal objects – that is, objects that evolve over time – appear in many applications. Due to the nature of such applications, storing the evolution of objects through time in order to answer historical queries (queries that refer to past states of the evolution) requires a very large specialized database, what is termed in this article a spatiotemporal archive. Efficient processing of historical queries on spatiotemporal archives requires equally sophisticated indexing schemes. Typical spatiotemporal indexing techniques represent the objects using minimum bounding regions (MBR) extended with a temporal dimension, which are then indexed using traditional multidimensional index structures. However, rough MBR approximations introduce excessive overlap between index nodes, which deteriorates query performance. This article introduces a robust indexing scheme for answering spatiotemporal queries more efficiently. A number of algorithms and heuristics are elaborated that can be used to preprocess a spatiotemporal archive in order to produce finer object approximations, which, in combination with a multiversion index structure, will greatly improve query performance in comparison to the straightforward approaches. The proposed techniques introduce a query efficiency vs. space tradeoff that can help tune a structure according to available resources. Empirical observations for estimating the necessary amount of additional storage space required for improving query performance by a given factor are also provided. Moreover, heuristics for applying the proposed ideas in an online setting are discussed. Finally, a thorough experimental evaluation is conducted to show the merits of the proposed techniques. Edited by B. Seeger A short version of this article appeared as “Efficient indexing of spatiotemporal objects” in the Proceedings of Extending Database Technology 2002 [19]. This work was partially supported by NSF grants IIS-9907477, EIA-9983445, NSF IIS 9984729, NSF ITR 0220148, NSF IIS-0133825, NRDRP, and the U.S. Department of Defense.  相似文献   

15.
We consider the problem of indexing a set of objects moving in d-dimensional spaces along linear trajectories. A simple external-memory indexing scheme is proposed to efficiently answer general range queries. The following are examples of the queries that can be answered by the proposed method: report all moving objects that will (i) pass between two given points within a specified time interval; (ii) become within a given distance from some or all of a given set of other moving objects. Our scheme is based on mapping the objects to a dual space, where queries about moving objects are transformed into polyhedral queries concerning their speeds and initial locations. We then present a simple method for answering such polyhedral queries, based on partitioning the space into disjoint regions and using a B+-tree to index the points in each region. By appropriately selecting the boundaries of each region, we guarantee an average search time that matches a known lower bound for the problem. Specifically, for a fixed d, if the coordinates of a given set of N points are statistically independent, the proposed technique answers polyhedral queries, on the average, in O((N/B)^(1-1/d) · (log_B N)^(1/d) + K/B) I/Os using O(N/B) space, where B is the block size, and K is the number of reported points. Our approach is novel in that, while it provides a theoretical upper bound on the average query time, it avoids the use of complicated data structures, making it an effective candidate for practical applications. The proposed index is also dynamic in the sense that it allows object insertion and deletion in an amortized update cost of log_B(N) I/Os. Experimental results are presented to show the superiority of the proposed index over other methods based on R-trees.
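A one-dimensional illustration of the dual-space mapping described above: a linearly moving object x(t) = x0 + v·t becomes the dual point (v, x0), and "passes through [xl, xh] at some time in [t1, t2]" becomes a region bounded by linear constraints on (v, x0). Brute-force filtering stands in for the B+-tree partitioning of the dual space, and the objects below are made up.

```python
def to_dual(x0, v):
    """An object with trajectory x(t) = x0 + v*t is stored as the point (v, x0)."""
    return (v, x0)

def in_query_region(dual_pt, xl, xh, t1, t2):
    """Linear constraints in (v, x0): the segment {x0 + v*t : t in [t1, t2]}
    intersects [xl, xh] iff it is neither entirely left nor entirely right of it."""
    v, x0 = dual_pt
    lo, hi = sorted((x0 + v * t1, x0 + v * t2))
    return not (hi < xl or lo > xh)

objects = {"a": (0.0, 2.0), "b": (10.0, -1.0), "c": (50.0, 0.0)}   # (x0, v), made up
duals = {oid: to_dual(x0, v) for oid, (x0, v) in objects.items()}
hits = [oid for oid, d in duals.items() if in_query_region(d, 5.0, 8.0, 1.0, 4.0)]
print(hits)   # 'a' reaches [5, 8] by t=4; 'b' passes through it moving left
```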

16.
A data structure, called a biased range tree, is presented that preprocesses a set S of n points in R^2 and a query distribution D for 2-sided orthogonal range counting queries (a.k.a. dominance counting queries). The expected query time for this data structure, when queries are drawn according to D, matches, to within a constant factor, that of the optimal comparison tree for S and D. The memory and preprocessing requirements of the data structure are O(n log n).

17.
XMin: Minimizing Tree Pattern Queries with Minimality Guarantee
Due to wide use of XPath, the problem of efficiently processing XPath queries has recently received a lot of attention. In particular, a considerable effort has been devoted to minimizing XPath queries since the efficiency of query processing greatly depends on the size of the query. Research work in this area can be classified into two categories: constraint-independent minimization and constraint-dependent minimization. The former minimizes queries in the absence of integrity constraints while the latter in the presence of them. For a linear path query, which is an XPath query without branching predicates, existing constraint-independent minimization methods are generally known to be unable to minimize the query without processing the query itself. Most recently, however, by using the DataGuide, a representative structural summary of XML data, a constraint-independent method that minimizes linear path queries in a top-down fashion has been proposed. Nevertheless, this method can fail to find a minimal query since it minimizes a query by merely erasing labels from the original query whereas a minimal query could include labels that are not present in the original query. In this paper, we propose a bottom-up approach called XMin that guarantees finding a minimal query for a given tree pattern query by using the DataGuide without processing the query itself. For the linear path query, we first show that the sequence of labels occurring in the minimal query is a subsequence of every schema label sequence that matches the original query. Here, the schema label sequence for a node is the sequence of labels from the root of XML data to the node. We then propose iterative subsequence generation that iteratively generates subsequences from the shortest schema label sequence matching the original query in a bottom-up fashion and tests query equivalence. Using iterative subsequence generation, we can always find a minimal query and we formally prove this guarantee. We also propose an extended algorithm that guarantees the minimality for the tree pattern query, which is a linear path query with branching predicates. These methods have been prototyped in a full-fledged object-relational DBMS. The experimental results using real and synthetic data sets show the practicality of our method.  相似文献   

18.
张凡  熊志平  胡运发 《计算机工程》2006,32(10):66-67,70
The tree pattern is a natural model for querying tree-structured data such as XML and LDAP. For a query over a given database, query efficiency depends heavily on the size of the query, so it is important to minimize the query by removing redundant branches before evaluation. In tree-structured databases, three kinds of integrity constraints are common: required-child, required-descendant, and subtype constraints. For the case where these three kinds of constraints are present, a tree pattern query minimization algorithm with complexity O(n^2) (where n is the number of nodes in the tree pattern query) is proposed, based on an extended notion of simulation. Analysis shows that this algorithm is considerably more efficient than comparable algorithms.

19.
In a typical Web recommendation system, objects are often described by many attributes. It also needs to serve many users with a diversified range of preferences. In other words, it must be capable to efficiently support high dimensional preference queries that allow the user to explore the data space effectively without imposing specific preference weightings for each dimension. The skyline query, which can produce a set of objects guaranteed to contain all top ranked objects for any linear attribute preference combination, has been proposed to support this type of recommendation applications. However, it suffers from the problem known as ‘dimensionality curse’ as the size of skyline query result set can grow exponentially with the number of dimensions. Therefore, when the dimensionality is high, a large percentage of objects can become skyline points. This problem makes such a recommendation system less usable for users. In this paper, we propose a stronger type of skyline query, called core skyline query, that adopts a new quality measure called vertical dominance to return only an interesting subset of the traditional skyline points. An efficient query processing method is proposed to find core skyline points using a novel indexing structure called Linked Multiple B’-trees (LMB). Our approach can find such superior skyline points progressively without the need of computing the entire set of skyline points first.  相似文献   
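For reference, a minimal sketch of the traditional skyline that the core skyline described above refines: a point survives if no other point is at least as good on every attribute and strictly better on at least one (smaller is better here). The vertical-dominance measure and the LMB index from the abstract are not reproduced.

```python
def dominates(p, q):
    """p dominates q: no worse on every attribute and strictly better on one."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))

def skyline(points):
    result = []
    for p in points:
        if any(dominates(q, p) for q in points if q is not p):
            continue
        result.append(p)
    return result

# (distance, price) pairs, made up; both attributes are minimized.
hotels = [(3, 120), (1, 300), (2, 150), (4, 80), (3, 100)]
print(skyline(hotels))   # [(1, 300), (2, 150), (4, 80), (3, 100)]
```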

20.
Set queries are an important topic and have attracted a lot of attention. Earlier research mainly concentrated on set containment queries. In this paper we focus on the T-Overlap query which is the foundation of the set similarity query. To address this issue, unlike traditional algorithms that are based on an inverted index, we design a new paradigm based on the prefix tree (trie) called the expanded trie index (ETI) which expands the trie node structure by adding some new properties. Based on ETI, we convert the T-Overlap problem to finding query nodes with specific query depth equaling to T and propose a new algorithm called T-Similarity to solve T-Overlap efficiently. Then we carry out a three-step framework to extend T-Overlap to other similarity predicates. Extensive experiments are carried out to compare T-Similarity with other inverted index based algorithms from cardinality of query, overlap threshold, dataset size, the number of distinct elements and so on. Results show that T-Similarity outperforms the state-of-the-art algorithms in many aspects.
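A minimal sketch of the inverted-index baseline that the abstract contrasts with: an inverted list maps each element to the sets containing it, and a T-Overlap query counts, per candidate set, how many of the query's elements it shares, keeping those with count at least T. The trie-based ETI itself is not reproduced; the data is made up.

```python
from collections import defaultdict

def build_inverted_index(sets):
    """element -> list of set ids containing it."""
    inv = defaultdict(list)
    for sid, s in sets.items():
        for e in s:
            inv[e].append(sid)
    return inv

def t_overlap(query, inv, t):
    """Return ids of sets sharing at least t elements with the query set."""
    counts = defaultdict(int)
    for e in set(query):
        for sid in inv.get(e, []):
            counts[sid] += 1
    return [sid for sid, c in counts.items() if c >= t]

sets = {"s1": {1, 2, 3, 4}, "s2": {3, 4, 5}, "s3": {7, 8}}
inv = build_inverted_index(sets)
print(t_overlap({2, 3, 4, 9}, inv, t=2))   # ['s1', 's2']
```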
