首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
Efficient indexing on a class hierarchy is essential for the achievement of high performance in query evaluation for object databases. In this paper, we present a practical indexing scheme, Partition Index Configuration Scheme (PINS), which provides good index configurations for any real database environment. PINS considers the distribution of key values, as well as query patterns such as query frequency on each class. PINS can easily be applied to any database system, since it uses the B+-tree structure. We develop a cost model and, through experiments, demonstrate the performance of the proposed policy over various class hierarchies.  相似文献   

2.
This paper develops a novel, compressed B+-tree based indexing scheme that supports the processing of moving objects in one-, two-, and multi- dimensional spaces. The past, current, and anticipated future trajectories of movements are fully indexed and well organized. No parameterized functions and geometric representations are introduced in our data model so that update operations are not required and the maintenance of index structures can be accomplished by basic insertion and deletion operations. The proposed method has two contributions. First, the spatial and temporal attributes of trajectories are accurately preserved and well organized into compact index structures with very efficient memory space utilization and storage requirement. Second, index maintenance overheads are more economical and query performance is more responsive than those of conventional methods. Both analytical and empirical studies show that our proposed indexing scheme outperforms the TPR-tree.  相似文献   

3.
Spatiotemporal objects – that is, objects that evolve over time – appear in many applications. Due to the nature of such applications, storing the evolution of objects through time in order to answer historical queries (queries that refer to past states of the evolution) requires a very large specialized database, what is termed in this article a spatiotemporal archive. Efficient processing of historical queries on spatiotemporal archives requires equally sophisticated indexing schemes. Typical spatiotemporal indexing techniques represent the objects using minimum bounding regions (MBR) extended with a temporal dimension, which are then indexed using traditional multidimensional index structures. However, rough MBR approximations introduce excessive overlap between index nodes, which deteriorates query performance. This article introduces a robust indexing scheme for answering spatiotemporal queries more efficiently. A number of algorithms and heuristics are elaborated that can be used to preprocess a spatiotemporal archive in order to produce finer object approximations, which, in combination with a multiversion index structure, will greatly improve query performance in comparison to the straightforward approaches. The proposed techniques introduce a query efficiency vs. space tradeoff that can help tune a structure according to available resources. Empirical observations for estimating the necessary amount of additional storage space required for improving query performance by a given factor are also provided. Moreover, heuristics for applying the proposed ideas in an online setting are discussed. Finally, a thorough experimental evaluation is conducted to show the merits of the proposed techniques. Edited by B. Seeger A short version of this article appeared as “Efficient indexing of spatiotemporal objects” in the Proceedings of Extending Database Technology 2002 [19]. This work was partially supported by NSF grants IIS-9907477, EIA-9983445, NSF IIS 9984729, NSF ITR 0220148, NSF IIS-0133825, NRDRP, and the U.S. Department of Defense.  相似文献   

4.
This paper studies the problem of processing supergraph queries, that is, given a database containing a set of graphs, find all the graphs in the database of which the query graph is a supergraph. Existing works usually construct an index and performs a filtering-and-verification process, which still requires many subgraph isomorphism testings. There are also significant overheads in both index construction and maintenance. In this paper, we design a graph querying system that achieves both fast indexing and efficient query processing. The index is constructed by a simple but fast method of extracting the commonality among the graphs, which does not involve any costly operation such as graph mining. Our query processing has two key techniques, direct inclusion and filtering. Direct inclusion allows partial query answers to be included directly without candidate verification. Our filtering technique further reduces the candidate set by operating on a much smaller projected database. Experimental results show that our method is significantly more efficient than the existing works in both indexing and query processing, and our index has a low maintenance cost.  相似文献   

5.
Content based image retrieval is an active area of research. Many approaches have been proposed to retrieve images based on matching of some features derived from the image content. Color is an important feature of image content. The problem with many traditional matching-based retrieval methods is that the search time for retrieving similar images for a given query image increases linearly with the size of the image database. We present an efficient color indexing scheme for similarity-based retrieval which has a search time that increases logarithmically with the database size.In our approach, the color features are extracted automatically using a color clustering algorithm. Then the cluster centroids are used as representatives of the images in 3-dimensional color space and are indexed using a spatial indexing method that usesR-tree. The worst case search time complexity of this approach isOn q log(N* navg)), whereN is the number of images in the database, andn q andn avg are the number of colors in the query image and the average number of colors per image in the database respectively. We present the experimental results for the proposed approach on two databases consisting of 337 Trademark images and 200 Flag images.  相似文献   

6.
The database community has devoted extensive amount of efforts to indexing and querying temporal data in the past decades. However, insufficient amount of attention has been paid to temporal ranking queries. More precisely, given any time instance t, the query asks for the top-k objects at time t with respect to some score attribute. Some generic indexing structures based on R-trees do support ranking queries on temporal data, but as they are not tailored for such queries, the performance is far from satisfactory. We present the Seb-tree, a simple indexing scheme that supports temporal ranking queries much more efficiently. The Seb-tree answers a top-k query for any time instance t in the optimal number of I/Os in expectation, namely, O(logB \fracNB+\frackB){O\left({\rm log}_B\,\frac{N}{B}+\frac{k}{B}\right)} I/Os, where N is the size of the data set and B is the disk block size. The index has near-linear size (for constant and reasonable k max values, where k max is the maximum value for the possible values of the query parameter k), can be constructed in near-linear time, and also supports insertions and deletions without affecting its query performance guarantee. Most of all, the Seb-tree is especially appealing in practice due to its simplicity as it uses the B-tree as the only building block. Extensive experiments on a number of large data sets, show that the Seb-tree is more than an order of magnitude faster than the R-tree based indexes for temporal ranking queries.  相似文献   

7.
Temporal XML: modeling, indexing, and query processing   总被引:1,自引:0,他引:1  
In this paper we address the problem of modeling and implementing temporal data in XML. We propose a data model for tracking historical information in an XML document and for recovering the state of the document as of any given time. We study the temporal constraints imposed by the data model, and present algorithms for validating a temporal XML document against these constraints, along with methods for fixing inconsistent documents. In addition, we discuss different ways of mapping the abstract representation into a temporal XML document, and introduce TXPath, a temporal XML query language that extends XPath 2.0. In the second part of the paper, we present our approach for summarizing and indexing temporal XML documents. In particular we show that by indexing continuous paths, i.e., paths that are valid continuously during a certain interval in a temporal XML graph, we can dramatically increase query performance. To achieve this, we introduce a new class of summaries, denoted TSummary, that adds the time dimension to the well-known path summarization schemes. Within this framework, we present two new summaries: LCP and Interval summaries. The indexing scheme, denoted TempIndex, integrates these summaries with additional data structures. We give a query processing strategy based on TempIndex and a type of ancestor-descendant encoding, denoted temporal interval encoding. We present a persistent implementation of TempIndex, and a comparison against a system based on a non-temporal path index, and one based on DOM. Finally, we sketch a language for updates, and show that the cost of updating the index is compatible with real-world requirements.  相似文献   

8.
Yun  Tae-Seob  Whang  Kyu-Young  Kwon  Hyuk-Yoon  Kim  Jun-Sung  Song  Il-Yeol 《World Wide Web》2019,22(6):2437-2467

We propose two-dimensional indexing—a novel in-memory indexing architecture that operates over distributed memory of a massively-parallel search engine. The goal of two-dimensional indexing is to provide a one-integrated-memory view as in a single node system using one large integrated memory. In two-dimensional indexing, we partition the entire index into n× m fragments and distribute them over the memories of multiple nodes in such a way that each fragment is entirely stored in main memory of one node. The proposed architecture is not only scalable as it uses a scaled-out shared-nothing architecture but also is capable of achieving low query response time as it processes queries in main memory. We also propose the concept of the one-memory point, which is the amount of the memory space required to completely store the entire index in main memory providing a one-integrated-memory view. We first prove the effectiveness of two-dimensional indexing with single-keyword queries, and then, extend the notion so as to be able to handle multiple-keyword queries. To handle multiple-keyword queries, we adopt pre-join that materializes a multiple-keyword query a priori as well as a new notion of semi-memory join that obviates extensive communication overhead to perform join across multiple nodes. In experiments using the real-life search query set over a database consisting of 100 million Web documents crawled, we show that two-dimensional indexing can effectively provide a one-integrated-memory view without too much of additional memory compared with the single node system using one large integrated memory. We also show that, with a six-node prototype, in an ideal case, it significantly improves the query processing performance over a disk-based search engine with an equivalent amount of in-memory buffer but without two-dimensional indexing — by up to 535.54 times. This improvement is expected to get larger as the system is scaled-out with a larger number of machines.

  相似文献   

9.
An efficient peer-to-peer indexing tree structure for multidimensional data   总被引:4,自引:1,他引:3  
As one of the most important technologies for implementing large-scale distributed systems, peer-to-peer (P2P) computing has attracted much attention in both research and industrial communities, for its advantages such as high availability, high performance, and high flexibility to the dynamics of networks. However, multidimensional data indexing remains as a big challenge to P2P computing, because of the inefficiency in search and network maintenance caused by the complicated existing index structures, which greatly limits the scalability of applications and dimensionality of the data to be indexed.We propose SDI (Swift tree structure for multidimensional Data Indexing), a swift index scheme with a simple tree structure for multidimensional data indexing in large-scale distributed systems. While keeping the query efficiency in O(logN) in terms of routing hops, SDI has extremely low maintenance costs which is proved through theoretical analysis. Furthermore, SDI overcomes the root-bottleneck problem existing in most other tree-based distributed indexing systems. Extensive empirical study verifies the superiority of SDI in both query and maintenance performance.  相似文献   

10.
针对现有基于日志结构合并树(LSM-Tree)实现的分布式数据库仅支持高效的主键查询,无法让用户快速地应用在自己的集群中的问题,提出了基于LSM-Tree的轻量级分布式索引实现方法SIBL。首先,通过对主键属性列建立索引来提高非主键属性的查询效率;然后,提出了分布式索引构建算法以及基于等距取样的索引区间划分算法,从而保证了索引在系统中的均匀分布,并且优化了传统索引的查询算法,将索引文件看作特殊的数据文件分布式地存储在系统中,从而保证了系统的负载均衡和可扩展性;最后,将该方法与华为二级索引方案HIndex在HBase数据库上进行实验来比较二者的索引构建的时间和空间开销、索引的查询性能和系统的负载均衡等性能,验证得出所提出的方法使查询性能提升了50~200倍。  相似文献   

11.
With the exponential growth of moving objects data to the Gigabyte range, it has become critical to develop effective techniques for indexing, updating, and querying these massive data sets. To meet the high update rate as well as low query response time requirements of moving object applications, this paper takes a novel approach in moving object indexing. In our approach, we do not require a sophisticated index structure that needs to be adjusted for each incoming update. Rather, we construct conceptually simple short-lived index images that we only keep for a very short period of time (sub-seconds) in main memory. As a consequence, the resulting technique MOVIES supports at the same time high query rates and high update rates, trading this property for query result staleness. Moreover, MOVIES is the first main memory method supporting time-parameterized predictive queries. To support this feature, we present two algorithms: non-predictive MOVIES and predictive MOVIES. We obtain the surprising result that a predictive indexing approach—considered state-of-the-art in an external-memory scenario—does not scale well in a main memory environment. In fact, our results show that MOVIES outperforms state-of-the-art moving object indexes such as a main-memory adapted B x -tree by orders of magnitude w.r.t. update rates and query rates. In our experimental evaluation, we index the complete road network of Germany consisting of 40,000,000 road segments and 38,000,000 nodes. We scale our workload up to 100,000,000 moving objects, 58,000,000 updates per second and 10,000 queries per second, a scenario at a scale unmatched by any previous work.  相似文献   

12.
倪巍伟  李灵奇  刘家强 《软件学报》2019,30(12):3782-3797
针对已有的保护位置隐私路网k近邻查询依赖可信匿名服务器造成的安全隐患,以及服务器端全局路网索引利用效率低的缺陷,提出基于路网局部索引机制的保护位置隐私路网近邻查询方法.查询客户端通过与LBS服务器的一轮通信获取局部路网信息,生成查询位置所在路段满足l-路段多样性的匿名查询序列,并将匿名查询序列提交LBS服务器,从而避免保护位置隐私查询对可信第三方服务器的依赖.在LBS服务器端,提出基于路网基本单元划分的分段式近邻查询处理策略,对频繁查询请求路网基本单元,构建基于路网泰森多边形和R*树的局部Vor-R*索引结构,实现基于索引的快速查找.对非频繁请求路网基本单元,采用常规路网扩张查询处理.有效降低索引存储规模和基于全局索引进行无差异近邻查询的访问代价,在保证查询结果正确的同时,提高了LBS服务器端k近邻查询处理效率.理论分析和实验结果表明,所提方法在兼顾查询准确性的同时,有效地提高了查询处理效率.  相似文献   

13.
With ever growing databases containing multimedia data, indexing has become a necessity to avoid a linear search. We propose a novel technique for indexing multimedia databases in which entries can be represented as graph structures. In our method, the topological structure of a graph as well as that of its subgraphs are represented as vectors whose components correspond to the sorted laplacian eigenvalues of the graph or subgraphs. Given the laplacian spectrum of graph G, we draw from recently developed techniques in the field of spectral integral variation to generate the laplacian spectrum of graph G+e without computing its eigendecomposition, where G+e is a graph obtained by adding edge e to graph G. This process improves the performance of the system for generating the subgraph signatures for 1.8% and 6.5% in datasets of size 420 and 1400, respectively. By doing a nearest neighbor search around the query spectra, similar but not necessarily isomorphic graphs are retrieved. Given a query graph, a voting schema ranks database graphs into an indexing hypothesis to which a final matching process can be applied. The novelties of the proposed method come from the powerful representation of the graph topology and successfully adopting the concept of spectral integral variation in an indexing algorithm. To examine the fitness of the new indexing framework, we have performed a number of experiments using an extensive set of recognition trials in the domain of 2D and 3D object recognition. The experiments, including a comparison with a competing indexing method using two different graph-based object representations, demonstrate both the robustness and efficacy of the overall approach.  相似文献   

14.
Peer-to-Peer (P2P) computing has recently attracted a great deal of research attention. In a P2P system, a large number of nodes can potentially be pooled together to share their resources, information, and services. However, existing unstructured P2P systems lack support for content-based search over data objects which are generally represented by high-dimensional feature vectors. In this paper, we propose an efficient and effective indexing mechanism to facilitate high-dimensional similarity query in unstructured P2P systems, named Linking Identical Neighborly Partitions (LINP), which combines both space partitioning technique and routing index technique. With the aid of LINP, each peer can not only process similarity query efficiently over its local data, but also can route the query to the promising peers which may contain the desired data. In the proposed scheme, each peer summarizes its local data using the space partitioning technique, and exchanges the summarized index with its neighboring peers to construct routing indices. Furthermore, to improve the system performance with peer updates, we propose an extension of the LINP, named LINP+, where each peer can reconfigure its neighboring peers to keep relevant peers nearby. The performance of our proposed scheme is evaluated over both synthetic and real-life high-dimensional datasets, and experimental results show the superiority of our proposed scheme.  相似文献   

15.
In the last decade, spatio-temporal database research focuses on the design of effective and efficient indexing structures in support of location-based queries such as predictive range queries and nearest neighbor queries. While a variety of indexing techniques have been proposed to accelerate the processing of updates and queries, not much attention has been paid to the updating protocol, which is another important factor affecting the system performance. In this paper, we propose a generic and adaptive updating protocol for moving object databases with less number of updates between objects and the database server, thereby reducing the overall workload of the system. In contrast to the approach adopted by most conventional moving object database systems where the exact locations and velocities last disclosed are used to predict their motions, we propose the concept of Spatio-temporal safe region to approximate possible future locations. Spatio-temporal safe regions provide larger space of tolerance for moving objects, freeing them from location and velocity updates as long as the errors remain predictable in the database. To answer predictive queries accurately, the server is allowed to probe the latest status of objects when their safe regions are inadequate in returning the exact query results. Spatio-temporal safe regions are calculated and optimized by the database server with two contradictory objectives: reducing update workload while guaranteeing query accuracy and efficiency. To achieve this, we propose a cost model that estimates the composition of active and passive updates based on historical motion records and query distribution. More system performance improvements can be obtained by cutting more updates from the clients, when the users of system are comfortable with incomplete but accuracy bounded query results. We have conducted extensive experiments to evaluate our proposal on a variety of popular indexing structures. The results confirm the viability, robustness, accuracy and efficiency of our proposed protocol.  相似文献   

16.
17.
We propose a new way of indexing a large database of small and medium-sized graphs and processing exact subgraph matching (or subgraph isomorphism) and approximate (full) graph matching queries. Rather than decomposing a graph into smaller units (e.g., paths, trees, graphs) for indexing purposes, we represent each graph in the database by its graph signature, which is essentially a multiset. We construct a disk-based index on all the signatures via bulk loading. During query processing, a query graph is also mapped into its signature, and this signature is searched using the index by performing multiset operations. To improve the precision of exact subgraph matching, we develop a new scheme using the concept of line graphs. Through extensive evaluation on real and synthetic graph datasets, we demonstrate that our approach provides a scalable and efficient disk-based solution for a large database of small and medium-sized graphs.  相似文献   

18.
19.
An efficient color and texture based iris image retrieval technique   总被引:1,自引:0,他引:1  
This paper proposes a hierarchical approach to retrieve an iris image efficiently from for a large iris database. This approach is a combination of both iris color and texture. Iris color is used for indexing and texture is used for retrieval of iris images from the indexed iris database. An index which is determined from the iris color is used to filter out the images that are not similar to the query image in color. Further, iris texture features of those filtered images, are used to determine the images which are similar to the query image. The iris color information helps to design an efficient indexing scheme based on color indices. The color indices are computed by averaging the intensity values of all red and blue color pixels. Kd-tree is used for real-time indexing based on color indices. The iris texture features are obtained through Speeded Up Robust Features (SURF) algorithm. These features are used to get the query’s corresponding image at the top best match. The performance of the proposed indexing scheme is compared with two well known iris indexing schemes ( [Mehrotra et al., 2010] and [Puhan and Sudha, 2008]) on UPOL (Dobeš, Machala, Tichavský, & Posp?´šil, 2004) and UBIRIS (Proencca & Alexandre, 2005) iris databases. It is observed that combination of iris color and texture improves the performance of iris recognition system.  相似文献   

20.
Points, lines, and regions are the three basic entities for constituting vector-based objects in spatial databases. Many indexing methods (G-tree, K-D-B tree, Quad-tree, PMR-tree, Grid-file, R-tree, and so on) have been widely discussed for handling point or region data. These traditional methods can efficiently organize point or region objects in a space into a hashing or hierarchical directory. They provide efficient access methods to meet the requirement of accurate retrievals. However, two problems are encountered when their techniques are applied to deal with line segments. The first is that representing line segments by means of point or region objects cannot exactly and properly preserve the spatial information about the proximities of line segments. The second problem is derived from the large dead space and overlapping areas in external and internal nodes of the hierarchical directory caused by the use of rectangles to enclose line objects. In this paper, we propose an indexing structure for line segments based on B + -tree to remedy these two problems. Through the experimental results, we demonstrate that our approach has significant improvement over the storage efficiency. In addition, the retrieval efficiency has also been significantly prompted as compared to the method using R-tree index scheme. These improvements derive mainly from the proposed data processing techniques and the new indexing method.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号