Similar literature
 20 similar documents found (search time: 31 ms)
1.
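SQLite is an open-source embedded relational database that uses a B-tree storage structure. To address the low efficiency of B-trees when handling very large data sets, an optimization scheme is proposed in which a red-black tree replaces the B-tree as the database indexing mechanism. The advantages of red-black trees in data processing are summarized, and a red-black-tree-based SQLite database model is built. Extensive experiments comparing the performance of red-black trees and B-trees show that insertion and deletion are far more efficient with the red-black tree, so SQLite indexing can be optimized in this way.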

2.
In anti-virus and anti-spyware applications, processing times increase multiplicatively with the growing complexity of detection logic and the fast-growing number of signatures, so data structures are needed for quick retrieval and efficient storage of a large collection of signatures. This paper presents a variant of the B-tree data structure in which the minimum degree constraint is relaxed while the order of the worst-case performance bounds for the primitive search, insert and delete operations of the B-tree is maintained. It presents a detailed case study of the impact of key (signature) size on storage utilization, given fixed-size nodes, and also derives a maximum optimal key size with respect to node size. This variant of the B-tree is found to be especially useful for storing a large number of keys whose sizes vary widely while the node size remains fixed.

3.
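A database index is a key data structure for speeding up data retrieval. Taking the B-tree, a common database index structure, as its starting point, this paper analyzes how indexes work. Drawing on the principles of external storage, it explains why most databases use the B+ tree as their index structure. It then examines the index implementation in the InnoDB storage engine of MySQL and analyzes its advantages and disadvantages.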

4.
An algorithm is presented to obtain the number of different values that appear a specified number of times in a given data field of a given data file. The study employs the well-known B-tree structure, with some modifications to the basic B-tree algorithm. The first modification allows a data item whose values are not necessarily distinct from one record to another to be used as a primary key. When a key value is inserted, the number of its previous appearances is counted. After all insertions, the algorithm yields the number of key values that are unique in the tree, the number that appear twice, three times, and so forth. This algorithm is especially powerful for large files in disk storage.
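The counting step described in this abstract can be sketched in a few lines. This is an illustration only: it uses Python's in-memory `collections.Counter` as a stand-in for the modified B-tree the paper describes, and the function name `frequency_of_frequencies` and the sample data are hypothetical.

```python
from collections import Counter


def frequency_of_frequencies(values):
    """Return {k: number of distinct values that appear exactly k times}."""
    # Stand-in for the modified B-tree: count the appearances of each key value.
    appearances = Counter(values)
    # Count how many distinct key values share each appearance count.
    return dict(Counter(appearances.values()))


# Hypothetical data field: "a" appears 3 times, "b" twice, "c" and "d" once.
sample = ["a", "b", "a", "c", "b", "a", "d"]
result = frequency_of_frequencies(sample)
print(result)
```

A disk-resident B-tree, as in the abstract, would serve the same purpose when the file is too large to count in memory.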

5.
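This paper proposes using the String B-tree to solve the parameterized pattern matching problem in software reuse. By applying a transformation to parameterized strings, the String B-tree, a specialized data structure, can improve matching efficiency. The paper has two main parts: the first introduces the advantages of the String B-tree and how it is constructed; the second explains how to use the String B-tree to solve the parameterized pattern matching problem.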

6.
Dynamic interval index structure in constraint database systems   Total citations: 1, self-citations: 0, citations by others: 1
Interval index structures play an important role in constraint database systems. This paper presents the DM-tree, a dynamic interval index structure. The advantage of the DM-tree over other interval index structures is that it supports the dynamic operations of insertion and deletion. The storage complexity of the tree is O(n), and the query I/O complexity is O(log n + t/B). To improve the performance of insertion and deletion, methods such as neighbored-constraint and update-late are applied. The I/O complexity of insertion and deletion is the same as that of the B-tree, i.e., O(log n).

7.
G-tree: a new data structure for organizing multidimensional data   Total citations: 4, self-citations: 0, citations by others: 4
The author describes an efficient data structure called the G-tree (or grid tree) for organizing multidimensional data. The data structure combines the features of grids and B-trees in a novel manner. It also exploits an ordering property that numbers the partitions in such a way that partitions that are spatially close to one another in a multidimensional space are also close in terms of their partition numbers. This structure adapts well to dynamic data spaces with a high frequency of insertions and deletions, and to nonuniform distributions of data. We demonstrate that it is possible to perform insertion, retrieval, and deletion operations, and to run various range queries efficiently using this structure. A comparison with the BD tree, zkdb tree and the KDB tree is carried out, and the advantages of the G-tree over the other structures are discussed. The simulated bucket utilization rates for the G-tree are also reported.

8.
A wide range of applications require that large quantities of data be maintained in sort order on disk. The B-tree and its variants are efficient general-purpose disk-based data structures that are almost universally used for this task. The B-trie has the potential to be a competitive alternative for the storage of data where strings are used as keys, but has not previously been thoroughly described or tested. We propose new algorithms for the insertion, deletion, and equality search of variable-length strings in a disk-resident B-trie, as well as novel splitting strategies which are a critical element of a practical implementation. We experimentally compare the B-trie against variants of the B-tree on several large sets of strings with a range of characteristics. Our results demonstrate that, although the B-trie uses more memory, it is faster, more scalable, and requires less disk space.

9.
Summary: We present a practical and efficient model for the estimation of average performance measures of B-trees under dynamic conditions of insertions and deletions. Performance measures computed are average storage utilization, average path length, and average tree height. The model introduces a data structure, called a lineage tree, which permits a highly compact representation of B-trees while still retaining information needed to compute the above performance measures. The model then involves a Markov chain in which the states are lineages obtained from the lineage tree. Probabilities, based on the number of B-tree structures corresponding to each lineage, are derived for the transition from one lineage to another under certain dynamic conditions. Results are given for tree orders ranging from 5 up to 401, and for numbers of keys up to 140000. Computer requirements are shown to be small to moderate.

10.
In a variety of applications, we need to keep track of the development of a data set over time. For maintaining and querying these multiversion data efficiently, external storage structures are an absolute necessity. We propose a multiversion B-tree that supports insertions and deletions of data items at the current version and range queries and exact match queries for any version, current or past. Our multiversion B-tree is asymptotically optimal in the sense that the time and space bounds are asymptotically the same as those of the (single-version) B-tree in the worst case. The technique we present for transforming a (single-version) B-tree into a multiversion B-tree is quite general: it applies to a number of hierarchical external access structures with certain properties directly, and it can be modified for others.

11.
Three information retrieval storage structures are considered to determine their suitability for a World Wide Web search engine: The Wolverhampton Web Library — The Next Generation. The structures are an inverted file, signature file and Pat tree. A number of implementations are considered for each structure. For the index of an inverted file a sorted array, B-tree, B+-tree, trie and hash table are considered. For the signature file vertical and horizontal partitioning schemes are considered and for the Pat tree a tree and array implementation are considered. A theoretical comparison of the structures is done on seven criteria that include: response time, support for results ranking, search techniques, file maintenance, efficient use of disk space (including the use of compression), scalability and extensibility. The comparison reveals that an inverted file is the most suitable structure, unlike the signature file and Pat tree, which encounter problems with very large corpora.

12.
Database applications very often require a sophisticated class of storage structures in order to answer different types of queries efficiently. This often dictates that the file should be organized on multiple keys. Several storage structures have been proposed to satisfy these needs. Most of these are a generalization of the storage structures used for managing one-dimensional data. The k-d tree is one such example and it is a natural generalization of the standard one-dimensional binary search tree. Recently, a new storage structure, called the BD tree, was proposed to manage multidimensional data. This structure has good dynamic characteristics. Several variations are possible on the basic k-d tree structure. This paper studies the performance implications of three variations. Further, it provides an empirical performance comparison of the k-d tree and BD tree in database applications.
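To make concrete how the k-d tree generalizes the one-dimensional binary search tree, here is a minimal sketch of the textbook k-d tree: the comparison key cycles through the coordinates as the depth increases. It is not one of the three variations studied in the paper, and the names `KDNode`, `kd_insert`, `kd_contains` and the sample points are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Point = Tuple[float, ...]


@dataclass
class KDNode:
    point: Point
    left: Optional["KDNode"] = None
    right: Optional["KDNode"] = None


def kd_insert(node: Optional[KDNode], point: Point, depth: int = 0) -> KDNode:
    """Insert a point; the discriminating axis cycles with the depth."""
    if node is None:
        return KDNode(point)
    axis = depth % len(point)  # a plain binary search tree always uses axis 0
    if point[axis] < node.point[axis]:
        node.left = kd_insert(node.left, point, depth + 1)
    else:
        node.right = kd_insert(node.right, point, depth + 1)
    return node


def kd_contains(node: Optional[KDNode], point: Point, depth: int = 0) -> bool:
    """Exact-match query that follows the same axis-cycling rule as insertion."""
    while node is not None:
        if node.point == point:
            return True
        axis = depth % len(point)
        node = node.left if point[axis] < node.point[axis] else node.right
        depth += 1
    return False


# Hypothetical two-dimensional sample data.
kd_root = None
for p in [(7, 2), (5, 4), (9, 6), (2, 3), (4, 7), (8, 1)]:
    kd_root = kd_insert(kd_root, p)

print(kd_contains(kd_root, (4, 7)), kd_contains(kd_root, (4, 8)))
```

With one-dimensional points the axis is always 0, so the structure reduces to an ordinary binary search tree.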

13.
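An index is one of the objects of a database. In a relational database, an index is built on one or more columns of a base table. Logically, an index is a two-dimensional table made up of two kinds of information: the index key, i.e., one or more frequently queried column attributes of the base table, and address information, i.e., the physical address of the base-table row that holds the index key. Physically, the index is organized as a B-tree. By how the base table is organized, indexes are divided into clustered and nonclustered indexes; by whether index key values are unique, they are divided into unique and non-unique indexes. This article focuses on clustered indexes and their B-tree structure. It uses examples to analyze the creation of a B-tree index on a two-dimensional table, as well as query and update operations on the B-tree structure, to illustrate how an index improves query efficiency and how update operations affect the index.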
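The lookup path this abstract describes, an index key paired with a row address and organized as a B-tree, can be sketched as follows. This is a simplified assumption-laden illustration, not the article's worked example: the names `BTreeNode` and `btree_search`, the string row addresses, and the hand-built two-level tree are all hypothetical, and insertion with node splitting is omitted.

```python
from bisect import bisect_left
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BTreeNode:
    keys: List[int]   # sorted index keys stored in this node
    rows: List[str]   # rows[i] is the (hypothetical) row address for keys[i]
    children: List["BTreeNode"] = field(default_factory=list)  # empty for a leaf


def btree_search(node: BTreeNode, key: int) -> Optional[str]:
    """Descend from the root, doing a binary search inside each node."""
    while True:
        i = bisect_left(node.keys, key)
        if i < len(node.keys) and node.keys[i] == key:
            return node.rows[i]      # key found: return its row address
        if not node.children:
            return None              # reached a leaf without finding the key
        node = node.children[i]      # follow the child covering this key range


# Hand-built two-level index: one root key separating two leaves.
leaf_low = BTreeNode([10, 20], ["row@10", "row@20"])
leaf_high = BTreeNode([40, 50], ["row@40", "row@50"])
index_root = BTreeNode([30], ["row@30"], [leaf_low, leaf_high])

print(btree_search(index_root, 40), btree_search(index_root, 35))
```

Each step of the loop reads one node, so the number of nodes visited is bounded by the height of the tree rather than by the number of rows in the base table.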

14.
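Research on multidimensional storage structures for online analytical processing   Total citations: 1, self-citations: 0, citations by others: 1
Online analytical processing (OLAP) uses multidimensional arrays as its storage structure to speed up query response. To treat every dimension equally and to accommodate sparse data, the multidimensional array must be partitioned. Two partitioning methods currently exist. This paper analyzes their advantages and disadvantages and gives a unified storage structure. Experimental results show that, to achieve a short conversion time and a high compression ratio, a suitable partition vector and data block size must be chosen.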

15.
Database applications often require a sophisticated class of storage structures in order to answer different types of queries efficiently. This often dictates that the file should be organized on multiple keys. Several storage structures have been proposed to satisfy these needs. Most are generalizations of the storage structures used for managing one-dimensional data. Recently, a new storage structure, called the BD tree, was proposed to manage multidimensional data. This structure has good dynamic characteristics. This paper presents algorithms for the BD tree to perform insertion, deletion, and to answer exact match, partial match and range queries. In addition, some experimental evidence is presented that suggests that BD trees have good dynamic characteristics.

16.
We propose a new efficient indexing scheme, called the HG-tree, to support content-based retrieval in image databases. Image content is represented by a point in a multidimensional feature space. The types of queries considered are the range query and the nearest-neighbor query, both in a multidimensional space. Our goals are twofold: increasing the storage utilization and decreasing the area covered by the directory regions of the index tree. The high storage utilization and the small directory area reduce the number of nodes that have to be touched during the query processing. The first goal is achieved by suppressing node splitting if possible, and when splitting is necessary, converting two nodes into three. This is done by proposing a good ordering on the directory nodes. The second goal is achieved by maintaining the area occupied by the directory region as small as possible. This is done by introducing the smallest interval that encloses all regions of the lower nodes. We note that there is a trade-off between the two design goals, but the HG-tree is so flexible that it can control the trade-off to some extent. We present the design of our indexing scheme and associated algorithms. In addition, we report the results of a series of tests, comparing the proposed index tree with the buddy-tree, which is one of the most successful point indexing schemes for a multidimensional space. The results show the superiority of our method.

17.
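Research on MOLAP storage and query techniques based on multidimensional databases   Total citations: 1, self-citations: 0, citations by others: 1
Compared with relational databases, multidimensional databases based on multidimensional arrays are better suited to representing and storing multidimensional data. This paper studies methods for organizing MOLAP data storage on top of a multidimensional database and mainly discusses three aspects: the multidimensional data schema, the data storage structure of the MDDB (Multidimensional DataBase), and MOLAP query analysis.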

18.
A typical class of structures to organize ordered files is multiway trees, among which the most widely used is the perfectly balanced B-tree. In this paper we present the new family of BMT multiway trees, which are kept balanced in height, similarly to the classical binary height-balanced trees used in central memory. The height of a BMT, that is, the maximum search length for a key, is shown to be a logarithmic function of the number of keys in the worst case. Updating a BMT by key insertion is studied, and a technique to keep the tree balanced is presented. A comparison between the performance of BMTs and B-trees leads to the conclusion that the two structures are roughly comparable in search length for a key, while BMTs require less memory space than B-trees for small node sizes. The real difference between BMTs and B-trees is in the rebalancing operation, which requires work proportional to the node size in BMTs and to the tree height in B-trees.

19.
Visual analytics of multidimensional multivariate data is a challenging task because of the difficulty in understanding metrics in attribute spaces with more than three dimensions. Frequently, the analysis goal is not to look into individual records but to understand the distribution of the records at large and to find clusters of records with similar attribute values. A large number of (typically hierarchical) clustering algorithms have been developed to group individual records to clusters of statistical significance. However, only few visualization techniques exist for further exploring and understanding the clustering results. We propose visualization and interaction methods for analyzing individual clusters as well as cluster distribution within and across levels in the cluster hierarchy. We also provide a clustering method that operates on density rather than individual records. To not restrict our search for clusters, we compute density in the given multidimensional multivariate space. Clusters are formed by areas of high density. We present an approach that automatically computes a hierarchical tree of high density clusters. To visually represent the cluster hierarchy, we present a 2D radial layout that supports an intuitive understanding of the distribution structure of the multidimensional multivariate data set. Individual clusters can be explored interactively using parallel coordinates when being selected in the cluster tree. Furthermore, we integrate circular parallel coordinates into the radial hierarchical cluster tree layout, which allows for the analysis of the overall cluster distribution. This visual representation supports the comprehension of the relations between clusters and the original attributes. The combination of the 2D radial layout and the circular parallel coordinates is used to overcome the overplotting problem of parallel coordinates when looking into data sets with many records. 
We apply an automatic coloring scheme based on the 2D radial layout of the hierarchical cluster tree encoding hue, saturation, and value of the HSV color space. The colors support linking the 2D radial layout to other views such as the standard parallel coordinates or, in case data is obtained from multidimensional spatial data, the distribution in object space.

20.
Summary: A simple mathematical model for analyzing the dynamics of a B-tree node is presented. From the solution of the model, it is shown that the simple technique of allowing a B-tree node to be slightly less than half full can significantly reduce the rate of split, merge and borrow operations. We call split, merge, borrow and balance operations unsafe operations in this paper. In a multi-user environment, a lower unsafe operation rate implies less blocking and higher throughput, even when tailored concurrency control algorithms (e.g., that proposed by Lehman and Yao [10]) are used. A lower unsafe operation rate also means a longer lifetime for an optimally initialized B-tree (e.g., a compact B-tree). It is in general useful to have an analytical model which can predict the rate of unsafe operations in a dynamic data structure, not only for comparing the behavior of variations of B-trees, but also for characterizing workload for performance evaluation of different concurrency control algorithms for such data structures. The model presented in this paper represents a starting point in this direction.

