首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
Speeding up construction of PMR quadtree-based spatial indexes   总被引:5,自引:0,他引:5  
Spatial indexes, such as those based on the quadtree, are important in spatial databases for efficient execution of queries involving spatial constraints, especially when the queries involve spatial joins. In this paper we present a number of techniques for speeding up the construction of quadtree-based spatial indexes, specifically the PMR quadtree, which can index arbitrary spatial data. We assume a quadtree implementation using the “linear quadtree”, a disk-resident representation that stores objects contained in the leaf nodes of the quadtree in a linear index (e.g., a B-tree) ordered based on a space-filling curve. We present two complementary techniques: an improved insertion algorithm and a bulk-loading method. The bulk-loading method can be extended to handle bulk-insertions into an existing PMR quadtree. We make some analytical observations about the I/O cost and CPU cost of our PMR quadtree bulk-loading algorithm, and conduct an extensive empirical study of the techniques presented in the paper. Our techniques are found to yield significant speedup compared to traditional quadtree building methods, even when the size of a main memory buffer is very small compared to the size of the resulting quadtrees. Edited by R. Sacks-Davis. Received: July 10, 2001 / Accepted: March 25, 2002 Published online: September 25, 2002  相似文献   

2.
Real-world entities are inherently spatially and temporally referenced, and database applications increasingly exploit databases that record the past, present, and anticipated future locations of entities, e.g., the residences of customers obtained by the geo-coding of addresses. Indices that efficiently support queries on the spatio-temporal extents of such entities are needed. However, past indexing research has progressed in largely separate spatial and temporal streams. Adding time dimensions to spatial indices, as if time were a spatial dimension, neither supports nor exploits the special properties of time. On the other hand, temporal indices are generally not amenable to extension with spatial dimensions. This paper proposes the first efficient and versatile index for a general class of spatio-temporal data: the discretely changing spatial aspect of an object may be a point or may have an extent; both transaction time and valid time are supported, and a generalized notion of the current time, now, is accommodated for both temporal dimensions. The index is based on the R-tree and provides means of prioritizing space versus time, which enables it to adapt to spatially and temporally restrictive queries. Performance experiments are reported that evaluate pertinent aspects of the index. Edited by T. Sellis. Received: 7 December 2000 / Accepted: 1 September 2001 Published online: 18 December 2001  相似文献   

3.
Numerous applications such as stock market or medical information systems require that both historical and current data be logically integrated into a temporal database. The underlying access method must support different forms of “time-travel” queries, the migration of old record versions onto inexpensive archive media, and high insertion and update rates. This paper presents an access method for transaction-time temporal data, called the log-structured history data access method (LHAM) that meets these demands. The basic principle of LHAM is to partition the data into successive components based on the timestamps of the record versions. Components are assigned to different levels of a storage hierarchy, and incoming data is continuously migrated through the hierarchy. The paper discusses the LHAM concepts, including concurrency control and recovery, our full-fledged LHAM implementation, and experimental performance results based on this implementation. A detailed comparison with the TSB-tree, both analytically and based on experiments with real implementations, shows that LHAM is highly superior in terms of insert performance, while query performance is in almost all cases at least as good as for the TSB-tree; in many cases it is much better. Received March 4, 1999 / Accepted September 28, 1999  相似文献   

4.
Two methods for cloud visualisation from weather simulation data   总被引:2,自引:0,他引:2  
For the local TV presentation of weather forecast data, it is important to have high-quality and fast visualisation of clouds. In this paper, I present methods for the visualisation of clouds from data produced by a meteorological weather simulation. Isosurfaces, which are originally too coarse because of the data grid resolution, are refined and deformed. The resulting geometry is used for cloud visualisation.  相似文献   

5.
Searching in metric spaces by spatial approximation   总被引:5,自引:0,他引:5  
We propose a new data structure to search in metric spaces. A metric space is formed by a collection of objects and a distance function defined among them which satisfies the triangle inequality. The goal is, given a set of objects and a query, retrieve those objects close enough to the query. The complexity measure is the number of distances computed to achieve this goal. Our data structure, called sa-tree (“spatial approximation tree”), is based on approaching the searched objects spatially, that is, getting closer and closer to them, rather than the classic divide-and-conquer approach of other data structures. We analyze our method and show that the number of distance evaluations to search among n objects is sublinear. We show experimentally that the sa-tree is the best existing technique when the metric space is hard to search or the query has low selectivity. These are the most important unsolved cases in real applications. As a practical advantage, our data structure is one of the few that does not need to tune parameters, which makes it appealing for use by non-experts. Edited by R. Sacks-Davis Received: 17 April 2001 / Accepted: 24 January 2002 / Published online: 14 May 2002  相似文献   

6.
Integrated spatial and feature image query   总被引:3,自引:0,他引:3  
Smith  John R.  Chang  Shih-Fu 《Multimedia Systems》1999,7(2):129-140
We present a new system for querying for images by regions and their spatial and feature attributes. The system enables the user to find the images that contain arrangements of regions similar to those diagrammed in a query image. By indexing the attributes of regions, such as sizes, locations and visual features, a wide variety of complex joint spatial and feature queries are efficiently computed. In order to demonstrate the utility of the system, we develop a process for the extracting color regions from photographic images. We demonstrate that integrated spatial and feature querying using color regions improves image search capabilities over non-spatial content-based image retrieval methods.  相似文献   

7.
8.
In the past decade, advances in the speed of commodity CPUs have far out-paced advances in memory latency. Main-memory access is therefore increasingly a performance bottleneck for many computer applications, including database systems. In this article, we use a simple scan test to show the severe impact of this bottleneck. The insights gained are translated into guidelines for database architecture, in terms of both data structures and algorithms. We discuss how vertically fragmented data structures optimize cache performance on sequential data access. We then focus on equi-join, typically a random-access operation, and introduce radix algorithms for partitioned hash-join. The performance of these algorithms is quantified using a detailed analytical model that incorporates memory access cost. Experiments that validate this model were performed on the Monet database system. We obtained exact statistics on events such as TLB misses and L1 and L2 cache misses by using hardware performance counters found in modern CPUs. Using our cost model, we show how the carefully tuned memory access pattern of our radix algorithms makes them perform well, which is confirmed by experimental results. Received April 20, 2000 / Accepted June 23, 2000  相似文献   

9.
Efficient similarity search for market basket data   总被引:2,自引:0,他引:2  
Several organizations have developed very large market basket databases for the maintenance of customer transactions. New applications, e.g., Web recommendation systems, present the requirement for processing similarity queries in market basket databases. In this paper, we propose a novel scheme for similarity search queries in basket data. We develop a new representation method, which, in contrast to existing approaches, is proven to provide correct results. New algorithms are proposed for the processing of similarity queries. Extensive experimental results, for a variety of factors, illustrate the superiority of the proposed scheme over the state-of-the-art method. Edited by R. Ng. Received: August 6, 2001 / Accepted: May 21, 2002 Published online: September 25, 2002  相似文献   

10.
Abstract. This paper presents structural recursion as the basis of the syntax and semantics of query languages for semistructured data and XML. We describe a simple and powerful query language based on pattern matching and show that it can be expressed using structural recursion, which is introduced as a top-down, recursive function, similar to the way XSL is defined on XML trees. On cyclic data, structural recursion can be defined in two equivalent ways: as a recursive function which evaluates the data top-down and remembers all its calls to avoid infinite loops, or as a bulk evaluation which processes the entire data in parallel using only traditional relational algebra operators. The latter makes it possible for optimization techniques in relational queries to be applied to structural recursion. We show that the composition of two structural recursion queries can be expressed as a single such query, and this is used as the basis of an optimization method for mediator systems. Several other formal properties are established: structural recursion can be expressed in first-order logic extended with transitive closure; its data complexity is PTIME; and over relational data it is a conservative extension of the relational calculus. The underlying data model is based on value equality, formally defined with bisimulation. Structural recursion is shown to be invariant with respect to value equality. Received: July 9, 1999 / Accepted: December 24, 1999  相似文献   

11.
Traditional algorithms for optimizing the execution order of joins are no more valid when selections and projections involve methods and become very expensive operations. Selections and projections could be even more costly than joins such that they are pulled above joins, rather than pushed down in a query tree. In this paper, we take a fundamental look at how to approach query optimization from a top-down design perspective, rather than trying to force one model to fit into another. We present a graph model which is designed to characterize execution plans. Each edge and each vertex of the graph is assigned a weight to model execution plans. We also design algorithms that use these weights to optimize the execution order of operations. A cost model of these algorithms is developed. Experiments are conducted on the basis of this cost model. The results show that our algorithms are superior to similar work proposed in the literature. Received 20 April 1999 / Accepted 9 August 2000 Published online 20 April 2001  相似文献   

12.
We present a high-performance solution to the I/O retrieval problem in a distributed multimedia system. Parallelism of data retrieval is achieved by striping the data across multiple disks. We identify the components that contribute to media data-retrieval delay. The variable delays among these have a great bearing on the server throughput under varying load conditions. We present a buffering scheme to minimize these variations. We have implemented our model on the Intel Paragon parallel computer. The results of component-wise instrumentation of the server operation are presented and analyzed. Experimental results that demonstrate the efficacy of the buffering scheme are presented. Based on our experiments, a dynamic admission-control policy that takes server workloads into account is proposed.  相似文献   

13.
Symbolic images are composed of a finite set of symbols that have a semantic meaning. Examples of symbolic images include maps (where the semantic meaning of the symbols is given in the legend), engineering drawings, and floor plans. Two approaches for supporting queries on symbolic-image databases that are based on image content are studied. The classification approach preprocesses all symbolic images and attaches a semantic classification and an associated certainty factor to each object that it finds in the image. The abstraction approach describes each object in the symbolic image by using a vector consisting of the values of some of its features (e.g., shape, genus, etc.). The approaches differ in the way in which responses to queries are computed. In the classification approach, images are retrieved on the basis of whether or not they contain objects that have the same classification as the objects in the query. On the other hand, in the abstraction approach, retrieval is on the basis of similarity of feature vector values of these objects. Methods of integrating these two approaches into a relational multimedia database management system so that symbolic images can be stored and retrieved based on their content are described. Schema definitions and indices that support query specifications involving spatial as well as contextual constraints are presented. Spatial constraints may be based on both locational information (e.g., distance) and relational information (e.g., north of). Different strategies for image retrieval for a number of typical queries using these approaches are described. Estimated costs are derived for these strategies. Results are reported of a comparative study of the two approaches in terms of image insertion time, storage space, retrieval accuracy, and retrieval time. Received June 12, 1998 / Accepted October 13, 1998  相似文献   

14.
A distributed dynamic channel allocation algorithm has been proposed in [11]. In this paper the algorithm is modelled using predicate/transition nets. The same model can be used for any number of cell and channel configurations. The Maria reachability analyser has been used to analyse the protocol for some configurations. These are deadlock-free and are shown to satisfy the requirement that the same channel is never allocated to two neighbouring cells. The suitability of high-level nets for the modelling and analysis of distributed algorithms is discussed. Published online: 24 August 2001  相似文献   

15.
Spatial indexing of high-dimensional data based on relative approximation   总被引:2,自引:0,他引:2  
We propose a novel index structure, the A-tree (approximation tree), for similarity searches in high-dimensional data. The basic idea of the A-tree is the introduction of virtual bounding rectangles (VBRs) which contain and approximate MBRs or data objects. VBRs can be represented quite compactly and thus affect the tree configuration both quantitatively and qualitatively. First, since tree nodes can contain a large number of VBR entries, fanout becomes large, which increases search speed. More importantly, we have a free hand in arranging MBRs and VBRs in the tree nodes. Each A-tree node contains an MBR and its children VBRs. Therefore, by fetching an A-tree node, we can obtain information on the exact position of a parent MBR and the approximate position of its children. We have performed experiments using both synthetic and real data sets. For the real data sets, the A-tree outperforms the SR-tree and the VA-file in all dimensionalities up to 64 dimensions, which is the highest dimension in our experiments. Additionally, we propose a cost model for the A-tree. We verify the validity of the cost model for synthetic and real data sets. Edited by T. Sellis. Received: December 8, 2000 / Accepted: March 20, 2002 Published online: September 25, 2002  相似文献   

16.
We describe a novel approach for clustering collections of sets, and its application to the analysis and mining of categorical data. By “categorical data,” we mean tables with fields that cannot be naturally ordered by a metric – e.g., the names of producers of automobiles, or the names of products offered by a manufacturer. Our approach is based on an iterative method for assigning and propagating weights on the categorical values in a table; this facilitates a type of similarity measure arising from the co-occurrence of values in the dataset. Our techniques can be studied analytically in terms of certain types of non-linear dynamical systems. Received February 15, 1999 / Accepted August 15, 1999  相似文献   

17.
In video processing, a common first step is to segment the videos into physical units, generally called shots. A shot is a video segment that consists of one continuous action. In general, these physical units need to be clustered to form more semantically significant units, such as scenes, sequences, programs, etc. This is the so-called story-based video structuring. Automatic video structuring is of great importance for video browsing and retrieval. The shots or scenes are usually described by one or several representative frames, called key-frames. Viewed from a higher level, key frames of some shots might be redundant in terms of semantics. In this paper, we propose automatic solutions to the problems of: (i) video partitioning, (ii) key frame computing, (iii) key frame pruning. For the first problem, an algorithm called “net comparison” is devised. It is accurate and fast because it uses both statistical and spatial information in an image and does not have to process the entire image. For the last two problems, we develop an original image similarity criterion, which considers both spatial layout and detail content in an image. For this purpose, coefficients of wavelet decomposition are used to derive parameter vectors accounting for the above two aspects. The parameters exhibit (quasi-) invariant properties, thus making the algorithm robust for many types of object/camera motions and scaling variances. The novel “seek and spread” strategy used in key frame computing allows us to obtain a large representative range for the key frames. Inter-shot redundancy of the key-frames is suppressed using the same image similarity measure. Experimental results demonstrate the effectiveness and efficiency of our techniques.  相似文献   

18.
Abstract. The paper proposes a new method for efficient triangulation of large, unordered sets of 3D points using a CAD model comprising NURBS entities. It is primarily aimed at engineering applications involving analysis and visualisation of measured data, such as inspection, where a model of the object in question is available. Registration of the data to the model is the necessary first step, enabling the triangulation to be efficiently performed in 2D, on the projections of the measured points onto the model entities. The derived connectivity is then applied to the original 3D data. Improvement of the generated 3D mesh is often necessary, involving mesh smoothing, constraint-based elimination of redundant triangles and merging of mesh patches. Examples involving random measurements on aerospace and automotive free-form components are presented. Received: 30 August 1999 / Accepted: 10 January 2000  相似文献   

19.
The GMAP: a versatile tool for physical data independence   总被引:1,自引:0,他引:1  
Physical data independence is touted as a central feature of modern database systems. It allows users to frame queries in terms of the logical structure of the data, letting a query processor automatically translate them into optimal plans that access physical storage structures. Both relational and object-oriented systems, however, force users to frame their queries in terms of a logical schema that is directly tied to physical structures. We present an approach that eliminates this dependence. All storage structures are defined in a declarative language based on relational algebra as functions of a logical schema. We present an algorithm, integrated with a conventional query optimizer, that translates queries over this logical schema into plans that access the storage structures. We also show how to compile update requests into plans that update all relevant storage structures consistently and optimally. Finally, we report on experiments with a prototype implementation of our approach that demonstrate how it allows storage structures to be tuned to the expected or observed workload to achieve significantly better performance than is possible with conventional techniques. Edited by Matthias Jarke, Jorge Bocca, Carlo Zaniolo. Received September 15, 1994 / Accepted September 1, 1995  相似文献   

20.
Two methods for stroke segmentation from a global point of view are presented and compared. One is based on thinning methods and the other is based on contour curve fitting. For both cases an input image is binarized. For the former, Hilditch's method is used, then crossing points are sought, around which a domain is constructed. Outside the domain, a set of line segments are identified. These lines are connected and approximated by cubic B-spline curves. Smoothly connected lines are selected as segmented curves. This method works well for a limited class of crossing lines, which are shown experimentally. In the latter, a contour line is approximated by cubic B-spline curve, along which curvature is measured. According to the extreme points of the curvature graph, the contour line is segmented, based on which the line segment is obtained. Experimental results are shown for some difficult cases. Received October 31, 1998 / Revised January 12, 1999  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号