Similar literature
20 similar documents found
1.
Multidimensional data structures have proved to be efficient storage structures for multidimensional dynamic data bases. A judicious choice of a sequence of discriminators can greatly reduce search costs for partial match queries. This paper proposes a scheme for optimal choice of sequence of discriminators for the K-D-B tree which takes the query statistics into account. The scheme is later extended to take into account various system parameters like seek time, block size, disk transfer rates and page placement policy. The problem of choosing the sequence is formulated as a 0–1 integer programming problem, the solution to which gives the required sequence of discriminators.

2.
Database applications often require a sophisticated class of storage structures in order to answer different types of queries efficiently. This often dictates that the file should be organized on multiple keys. Several storage structures have been proposed to satisfy these needs. Most are generalizations of the storage structures used for managing one-dimensional data. Recently, a new storage structure, called the BD tree, was proposed to manage multidimensional data. This structure has good dynamic characteristics. This paper presents algorithms for the BD tree to perform insertion, deletion, and to answer exact match, partial match and range queries. In addition, some experimental evidence is presented that suggests that BD trees have good dynamic characteristics.

3.
Given a collection of points in the plane, pick an arbitrary horizontal segment and move it vertically until it hits one of the points (if at all). This form of segment-dragging is a common operation in computer graphics and motion planning; it can also serve as a building block for multidimensional data structures. This note describes a new approach to segment-dragging which yields a simple and efficient solution. The data structure requires O(n) storage and O(n log n) preprocessing time, and each query can be answered in O(log n) time, where n is the number of points in the collection. The method is best understood as the end result of a sequence of transformations applied to a simple but inefficient starting solution. This work was started while the author was a visiting professor at Ecole Normale Supérieure, Paris, France.
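
To make the segment-dragging query concrete, here is a minimal sketch of a naive linear-scan answer. It is only a baseline for illustration, not the O(log n) structure the note describes. The function name `drag_segment_up`, the tuple representation of points, and the choice to drag upward (rather than downward) are all assumptions made for this example.

```python
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float]


def drag_segment_up(
    points: Sequence[Point], x_lo: float, x_hi: float, y: float
) -> Optional[Point]:
    """Return the first point hit when the horizontal segment
    [x_lo, x_hi] at height y is moved upward, or None if it hits nothing.

    Hypothetical helper: a plain O(n) scan per query with no preprocessing.
    """
    hit: Optional[Point] = None
    for px, py in points:
        # Only points inside the segment's x-range and at or above y can be hit.
        if x_lo <= px <= x_hi and py >= y:
            # Keep the lowest such point: it is reached first.
            if hit is None or py < hit[1]:
                hit = (px, py)
    return hit


if __name__ == "__main__":
    pts = [(1, 5), (2, 3), (4, 1), (6, 2)]
    print(drag_segment_up(pts, 0, 3, 0))
```

Each query here costs time linear in the number of points, which is the kind of cost a preprocessed structure is meant to avoid.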

4.
An efficient peer-to-peer indexing tree structure for multidimensional data   (total citations: 4; self-citations: 1; citations by others: 3)
As one of the most important technologies for implementing large-scale distributed systems, peer-to-peer (P2P) computing has attracted much attention in both research and industrial communities for its advantages, such as high availability, high performance, and high flexibility to the dynamics of networks. However, multidimensional data indexing remains a big challenge for P2P computing because the complicated existing index structures make search and network maintenance inefficient, which greatly limits the scalability of applications and the dimensionality of the data to be indexed. We propose SDI (Swift tree structure for multidimensional Data Indexing), a swift index scheme with a simple tree structure for multidimensional data indexing in large-scale distributed systems. While keeping the query efficiency at O(log N) in terms of routing hops, SDI has extremely low maintenance costs, which is proved through theoretical analysis. Furthermore, SDI overcomes the root-bottleneck problem existing in most other tree-based distributed indexing systems. An extensive empirical study verifies the superiority of SDI in both query and maintenance performance.

5.
Object-oriented databases (OODBs) provide an effective means for capturing complex data and semantic relationships underlying many real-world database applications. Because users' interactions with databases have increased significantly in today's era of client–server computing, it is important to examine users' ability to interact with such databases. We investigated a number of factors that potentially affect performance in writing queries on an OODB. First, we evaluated the utility of graphical and textual schemas associated with emerging OODBs from the perspective of database querying. Second, we examined the use of two different strategies (navigation and join) that could be used in writing OODB queries. Third, we examined a number of factors that potentially contribute to the complexity of an OODB query. Our exploratory study examined the performance of 20 graduate students in an experiment in which each participant wrote queries for two problems, one using a graphical OODB schema and the other a textual OODB schema. The participants had no prior exposure to the object-oriented data model. We found that there was no difference in query writing performance (either accuracy or time) using the graphical and textual schemas. Examination of query strategy revealed that a significant number of participants used a join strategy, rather than the navigation strategy that matches the database structure. Use of the join strategy resulted in significantly less accurate and slower query writing than did the navigation strategy. From the viewpoint of complexity, the number of objects referenced in a query, the number of starting points in the from clause, and the presence of special operators influenced both the accuracy and time of query writing.

6.
Approximate range aggregate queries are one of the most frequent and useful kinds of queries for Decision Support Systems (DSS), as they are widely used in many data analysis tasks. Traditionally, sampling-based techniques have been proposed to tackle this problem. However, their effectiveness degrades when the underlying data distribution is skewed. Another approach, based on outlier management, can limit the effect of data skew but fails to address other requirements of approximate range aggregate queries, such as error guarantees and query processing efficiency. In this paper, we present a technique that provides approximate answers to range aggregate queries on OLAP data cubes efficiently, with theoretical guarantees on the errors. Our basic idea is to build different data structures to manage outliers and the rest of the data. Carefully chosen outliers are organized in a quad-tree based indexing data structure to provide efficient access for query processing. A query-workload adaptive, tree-like synopsis data structure, called the Tunable Partition-Tree (TP-Tree), is proposed to organize samples extracted from non-outlier data. Our experiments demonstrate the merits of our technique in comparison with previous well-known techniques.

7.
8.
Y. Nekrich. Algorithmica, 2007, 49(2): 94-108
In this paper we present new space efficient dynamic data structures for orthogonal range reporting. The described data structures support planar range reporting queries in time $O(\log n + k \log\log(4n/(k+1)))$ and space $O(n \log\log n)$, or in time $O(\log n + k)$ and space $O(n \log^{\varepsilon} n)$ for any $\varepsilon > 0$. Both data structures can be constructed in $O(n \log n)$ time and support insert and delete operations in amortized time $O(\log^{2} n)$ and $O(\log n \log\log n)$ respectively. These results match the corresponding upper space bounds of Chazelle (SIAM J. Comput. 17, 427–462, 1988) for the static case. We also present a dynamic data structure for d-dimensional range reporting with search time $O(\log^{d-1} n + k)$, update time $O(\log^{d} n)$, and space $O(n \log^{d-2+\varepsilon} n)$ for any $\varepsilon > 0$. The model of computation used in our paper is a unit cost RAM with word size $\log n$. A preliminary version of this paper appeared in the Proceedings of the 21st Annual ACM Symposium on Computational Geometry 2005. Work partially supported by IST grant 14036 (RAND-APX).

9.
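Query processing over uncertain data has been a hot research topic in the database field in recent years. This paper proposes a range-constrained nearest-neighbor query over uncertain data. Given an uncertain dataset D = {o1, o2, …, on}, a range constraint R that is a simple polygon, and a fixed query point q, the range-constrained nearest-neighbor query returns the set of objects in D that both satisfy the range constraint R and can be the nearest neighbor of q. To process this query, the concept of a range-constrained nearest-neighbor core set and an algorithm for finding it are proposed. An optimization method for computing the range-constrained nearest-neighbor candidate set is also proposed, which reduces the query cost. Finally, experiments verify the effectiveness of the algorithm.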

10.
An algorithm is presented to answer window queries in a quadtree-based spatial database environment by retrieving all of the quadtree blocks in the underlying spatial database that cover the quadtree blocks that comprise the window. It works by decomposing the window operation into sub-operations over smaller window partitions. These partitions are the quadtree blocks corresponding to the window. Although a block b in the underlying spatial database may cover several of the smaller window partitions, b is only retrieved once rather than multiple times. This is achieved by using an auxiliary main memory data structure called the active border which requires O(n) additional storage for a window query of size n×n. As a result, the algorithm generates an optimal number of disk I/O requests to answer a window query (i.e., one request per covering quadtree block). A proof of correctness and an analysis of the algorithm's execution time and space requirements are given, as are some experimental results.
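
The first step mentioned in the abstract, splitting a window into the quadtree blocks that comprise it, can be sketched as follows. This is an illustrative recursion only: it does not model the active border or the retrieval of covering blocks from the database. The function name `window_blocks`, the `(x, y, side)` block representation, and the assumption of integer coordinates in a square space whose side is a power of two are all choices made for this example.

```python
from typing import Iterator, Tuple

Block = Tuple[int, int, int]  # (x, y, side) of an aligned square block


def window_blocks(
    wx: int, wy: int, ww: int, wh: int, size: int, bx: int = 0, by: int = 0
) -> Iterator[Block]:
    """Yield the maximal quadtree blocks that exactly tile the window
    [wx, wx+ww) x [wy, wy+wh) inside the block at (bx, by) with side `size`.

    Assumes integer coordinates and `size` a power of two (hypothetical setup).
    """
    # The current block does not overlap the window: nothing to report.
    if wx >= bx + size or wy >= by + size or wx + ww <= bx or wy + wh <= by:
        return
    # The current block lies entirely inside the window: report it whole.
    if wx <= bx and wy <= by and bx + size <= wx + ww and by + size <= wy + wh:
        yield (bx, by, size)
        return
    # Partial overlap: recurse into the four quadrants.
    half = size // 2
    for dy in (0, half):
        for dx in (0, half):
            yield from window_blocks(wx, wy, ww, wh, half, bx + dx, by + dy)


if __name__ == "__main__":
    # A 2x2 window at (1, 1) in a 4x4 space splits into four unit blocks.
    print(list(window_blocks(1, 1, 2, 2, 4)))
```

With integer coordinates, a block of side 1 that overlaps the window is always fully contained in it, so the recursion stops before `half` reaches zero.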

11.
We extend the General Entity Manipulator (GEM) language with a facility for defining multidatabase views which we call global views. Our language permits global entity types to be defined with the full features of GEM entity types including generalization, entity-valued attributes, and set-valued attributes. The language supports the definition of entity mappings which define how global entity occurrences are materialized from local entity occurrences and attribute mappings which define conversions between local and global attributes. Entity level mappings are defined with a GEM data retrieval expression or the outer join operator. Attribute mappings are defined with a rich variety of conversion techniques such as string operators, database queries, and pre-compiled procedures. The language also accommodates the initial definition of a global view as well as the maintenance of an existing global view.

12.
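A query reuse algorithm based on query-tree matching is proposed. First, the query trees already in the system are matched against the newly generated query tree, and the reuse benefit for the new query tree is computed; overlapping query operations are then reused according to that benefit. Experimental results show that the algorithm effectively reduces the total execution cost of continuous queries.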

13.
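To efficiently support queries over the past, current, and future trajectories of moving objects in a network, an L2R index is proposed, consisting of a two-level R-tree and a linked-list structure. The two-level R-tree indexes the road network and the past movement of the moving objects, while the objects' current positions and predicted future trajectories are stored in the linked list. The L2R index supports trajectory queries on moving objects in the network and, in particular, makes it easy to find all objects on the same route through the vertical linked list. On top of this index, range queries and point queries on moving objects are implemented, and experiments show that both the indexing and the query performance of the L2R structure are superior to those of the TPR-tree.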

14.
15.
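Spatial data warehouses effectively support the management and analysis of spatial data and provide more comprehensive decision support. This paper discusses the implementation of an effective means of spatial decision support: spatial region aggregate queries. Building on the aggregate cubetree and the aR-tree, it proposes aCR-tree, an index structure that efficiently answers region aggregate queries over both spatial and non-spatial dimensions, together with its associated algorithms, and analyzes the time complexity of the query algorithm. Compared with existing techniques, aCR-tree reduces the storage cost and the number of nodes accessed per query, and experiments show that the index structure delivers good storage and query performance.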

16.
This paper describes the theoretical framework and implementation of a database management system for storing and manipulating diverse probability distributions of discrete random variables with finite domains, and associated information. A formal Semistructured Probabilistic Object (SPO) data model and a Semistructured Probabilistic Query Algebra (SP-algebra) are proposed. The SP-algebra supports standard database queries as well as some specific to probabilities, such as conditionalization and marginalization. Thus, the Semistructured Probabilistic Database may be used as a backend to any application that involves the management of large quantities of probabilistic information, such as building stochastic models. The implementation uses XML encoding of SPOs to facilitate communication with diverse applications. The database management system has been implemented on top of a relational DBMS. The translation of SP-algebra queries into relational queries is discussed here, and the results of initial experiments evaluating the system are reported. Work performed while a Ph.D. student at the University of Kentucky.
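
For readers unfamiliar with the two probability-specific operations named above, the sketch below shows their textbook forms on a small joint distribution of discrete variables. It is an assumption-laden illustration, not the SP-algebra operators or the SPO data model themselves: the dictionary representation and the function names `marginalize` and `conditionalize` are invented for this example.

```python
from collections import defaultdict
from typing import Dict, Sequence, Tuple

# A joint distribution maps an outcome tuple (one value per variable) to its probability.
Joint = Dict[Tuple[str, ...], float]


def marginalize(joint: Joint, keep: Sequence[int]) -> Joint:
    """Sum out every variable whose position is not listed in `keep`."""
    out: Dict[Tuple[str, ...], float] = defaultdict(float)
    for outcome, p in joint.items():
        out[tuple(outcome[i] for i in keep)] += p
    return dict(out)


def conditionalize(joint: Joint, index: int, value: str) -> Joint:
    """Distribution of the remaining variables given that variable `index` equals `value`."""
    selected = {o: p for o, p in joint.items() if o[index] == value}
    total = sum(selected.values())
    if total == 0:
        raise ValueError("conditioning event has probability zero")
    # Drop the conditioned variable and renormalize by the event's probability.
    return {o[:index] + o[index + 1:]: p / total for o, p in selected.items()}


if __name__ == "__main__":
    # Hypothetical joint distribution over (weather, road) with made-up numbers.
    joint = {
        ("sunny", "dry"): 0.5,
        ("sunny", "wet"): 0.1,
        ("rainy", "dry"): 0.1,
        ("rainy", "wet"): 0.3,
    }
    print(marginalize(joint, [0]))
    print(conditionalize(joint, 0, "rainy"))
```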

17.
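An improved multidimensional storage structure for data warehousing   (total citations: 7; self-citations: 2; citations by others: 7)
Feng Jianhua, Jiang Xudong, Zhou Lizhu. Journal of Software (软件学报), 2002, 13(8): 1423-1429
There are currently two main approaches to the physical storage organization of data in a data warehouse: relational and multidimensional arrays. Each has its own advantages and disadvantages; from the standpoint of improving online analytical processing (OLAP) query performance, the multidimensional array approach is comparatively better. The main aim of this paper is to address the multidimensional storage structure of the data warehouse. To deal with several problems in current multidimensional array storage organization, the concepts of logical and physical storage of the Cube are proposed: the original multidimensional data space is first partitioned into logical subspaces, and each logical block is then divided into multiple physical blocks. Physical storage takes full account of the large size and high sparsity of multidimensional arrays and adopts new methods for distributing and compressing them. These concepts and methods effectively solve the efficiency problems of aggregation over hierarchies within a dimension and of Cube operations, markedly improve the response time of aggregate queries involving intra-dimension hierarchies, and also solve the efficiency problem of incremental maintenance.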

18.
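To address multi-path queries in tree-based spatial indexes and the lack of indexing on the time dimension, a Hilbert-R tree index construction strategy that combines time with clustering results is proposed. First, the spatiotemporal dataset is partitioned according to the data collection period and a time index is built on that basis; the spatial data are partitioned and encoded with a Hilbert curve, mapping spatial coordinates to a one-dimensional interval. Second, according to the distribution of data elements in space, a clustering algorithm that determines K dynamically is applied, and an efficient Hilbert-R tree spatial index is built from the clustering results. Finally, based on several common Redis key-value data structures, a hierarchical index is built over the time attributes and clustering results of the spatiotemporal data. In experiments on spatiotemporal range queries and target vector object queries, compared with the cache-conscious R+ tree (CCR+), the proposed algorithm effectively reduces time overhead, shortening query time by about 25% on average; it adapts well to data of different densities and better supports the use of Redis for querying massive spatiotemporal data.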
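
The mapping of spatial coordinates to a one-dimensional interval via a Hilbert curve can be illustrated with the standard bit-manipulation conversion below. This is a generic sketch, not necessarily the encoding used in the paper; the function name `hilbert_index` and the `order` parameter (the grid is 2^order cells per side) are assumptions for this example.

```python
def hilbert_index(order: int, x: int, y: int) -> int:
    """Map cell (x, y) of a 2**order x 2**order grid to its position
    along the Hilbert curve (0 .. 4**order - 1).

    Hypothetical helper using the usual iterative quadrant-rotation scheme.
    """
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        # Which quadrant of the current sub-square does the cell fall in?
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/reflect so the next level sees a canonically oriented curve.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d


if __name__ == "__main__":
    # Print the 4x4 grid of Hilbert positions, one row per y value.
    for y in range(4):
        print([hilbert_index(2, x, y) for x in range(4)])
```

Cells that are consecutive along the curve are always neighbors in the grid, which is why sorting by this key tends to keep spatially close objects close together in a one-dimensional index.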

19.
An Overview of Data Mining and Knowledge Discovery   (total citations: 9; self-citations: 0; citations by others: 9)
With massive amounts of data stored in databases, mining information and knowledge in databases has become an important issue in recent research. Researchers in many different fields have shown great interest in data mining and knowledge discovery in databases. Several emerging applications in information providing services, such as data warehousing and on-line services over the Internet, also call for various data mining and knowledge discovery techniques to understand user behavior better, to improve the service provided, and to increase the business opportunities. In response to such a demand, this article provides a comprehensive survey of recently developed data mining and knowledge discovery techniques and introduces some real application systems as well. In conclusion, this article also lists some problems and challenges for further research.

20.
We study the practical behavior of different algorithms and methods that aim to estimate the intrinsic dimension (IDim) in metric spaces. Some of them were specifically developed to evaluate the complexity of searching in metric spaces, based on different theories related to the distribution of distances between objects on such spaces. Others were originally designed for vector spaces only, and have been extended to general metric spaces. To empirically evaluate the fitness of various IDim estimations with the actual difficulty of searching in metric spaces, we compare two representatives of each of the broadest families of metric indices: those based on pivots and those based on compact partitions. Our conclusions are that the estimators Distance Exponent and Correlation best fit their purpose.
